net.jini.lookup.ServiceDiscoveryManager
is a utility class that can be used to help a Jini technology-enabled
service or client (Jini service or Jini client)
acquire services of interest that are registered with lookup services
with which the service or client wishes to interact.
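As a simple illustration, the following minimal sketch shows how a client might use this utility to acquire a matching service. The MyService interface, the 30-second wait, and the use of default discovery and lease managers (the null constructor arguments) are assumptions of the example, not requirements.

    import net.jini.core.lookup.ServiceItem;
    import net.jini.core.lookup.ServiceTemplate;
    import net.jini.lookup.ServiceDiscoveryManager;

    public class Client {
        // Hypothetical service interface, used only for this example.
        interface MyService { }

        public static void main(String[] args) throws Exception {
            // Passing null for both managers causes defaults to be created.
            ServiceDiscoveryManager sdm =
                    new ServiceDiscoveryManager(null, null);
            // Match any service that implements MyService.
            ServiceTemplate tmpl = new ServiceTemplate(
                    null, new Class[] { MyService.class }, null);
            // Block for up to 30 seconds waiting for a matching service;
            // the null argument means no additional filtering is applied.
            ServiceItem item = sdm.lookup(tmpl, null, 30 * 1000L);
            if (item != null) {
                MyService svc = (MyService) item.service;
                // ... use the service ...
            }
            sdm.terminate();
        }
    }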
The specification for the
ServiceDiscoveryManager
utility is contained in the Jini Service Discovery Utilities Specification,
which is available in HTML.
A full list of supported configuration entries is given in this utility's class documentation.
This utility uses the Logger named net.jini.lookup.ServiceDiscoveryManager. For a description of the
information that is logged, as well as the associated logging levels, refer to the
class documentation.
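For example, a client that wants to see this utility's diagnostic output during debugging might raise that logger's level programmatically; the choice of FINEST below is arbitrary.

    import java.util.logging.Level;
    import java.util.logging.Logger;

    public class EnableSdmLogging {
        public static void main(String[] args) {
            // Raise the level of this utility's logger so that its
            // diagnostic output is emitted (FINEST chosen arbitrarily).
            Logger.getLogger("net.jini.lookup.ServiceDiscoveryManager")
                  .setLevel(Level.FINEST);
        }
    }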
When the blocking version of the lookup method (the version that takes a wait duration) is
invoked, there are conditions under which a NullPointerException
can occur. One possible scenario in which this bug can occur is as follows:
SDM.lookup(duration)
  SDM creates cache ----- RegisterListenerTask (wait for events)
  LookupTask begins
    get snapshot (lookup state)
    -- done --
  wait(duration)
                        *** start targeted service ***
                        NOMATCH_MATCH event received
                        NotifyEventTask (process event)
                          NewOldServiceTask (new service)
                            addToMap -- (item, null)
  exit wait
  call cache.lookup()
    filterMaybeDiscard()
      getServiceItems()
        map(item, null) ===> NPE
                        filter -- map(item, filteredItem)
If the timing is right, getServiceItems can be entered before the filter is applied and
before the non-null filteredItem is placed in the map. If this occurs, getServiceItems
will attempt to access filteredItem.srvc, which results in a NullPointerException.
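One way to avoid this race is to skip map entries whose filter result has not yet been stored. The following is a minimal sketch of that idea, not the actual implementation; the serviceMap parameter and the use of ServiceItem for both the key and the filtered value are assumptions of the example.

    import java.util.ArrayList;
    import java.util.List;
    import java.util.Map;
    import net.jini.core.lookup.ServiceItem;

    class SnapshotHelper {
        // Collects only those entries whose filtered counterpart has
        // already been stored, skipping (item, null) entries that the
        // event thread has not yet finished filtering.
        static List<ServiceItem> getServiceItems(
                Map<ServiceItem, ServiceItem> serviceMap) {
            List<ServiceItem> result = new ArrayList<ServiceItem>();
            synchronized (serviceMap) {
                for (ServiceItem filtered : serviceMap.values()) {
                    if (filtered == null) {
                        continue; // not yet filtered; skip to avoid the NPE
                    }
                    result.add(filtered);
                }
            }
            return result;
        }
    }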
Prior to fixing this bug, in ServiceDiscoveryManager.LookupCacheImpl.notifyServiceMap,
only the sequence number of the new event and the sequence number stored
in the EventReg element of the eventRegMap
were compared when determining whether or not there is a "gap" in the event
sequence, in which case a snapshot would be requested from the associated
lookup service. This could result in multiple unnecessary snapshot requests, and thus unnecessary network traffic and resource usage.
To understand how this problem could occur, consider the following scenario:
1 activatable lookup service registered with Phoenix (or RMID)
4 services registered with that lookup service
1 client using a cache to discover the services
>>> phoenix -stop (or rmid -stop) to stop the lookup service
>>> all 4 services eventually discard the lookup service
>>> the client eventually discards the lookup service
>>> restart phoenix (or rmid) before 30 minutes have expired (to restart the lookup service and recover its state before the client's original event registration [E0] has a chance to expire)

When the client rediscovers the lookup service, its cache requests a new event registration [E1]. But the lookup service still has E0 in its state.
When the services rediscover the lookup service, they reregister with
the lookup service, and the lookup service sends a serviceAdded
(NOMATCH_MATCH
) event to both E0 and E1 for each of the
services. The cache's event listener receives both sets of events.
Prior to fixing this bug, notifyServiceMap
only analyzed the sequence numbers of the events. It did not distinguish which event
registration each event corresponded to. Because E0 was recovered after
the lookup service went down and was restarted, the sequence numbers
of the events sent to that registration were greater than
Integer.MAX_VALUE
, which resulted in a "gap" in the sequence so
that clients would know that they might have missed events. Because E1 was
not interrupted by a shutdown/restart, its sequence numbers started at 0. Thus, when serviceAdded events arrived for E0 and for E1, the sequence
looked something like the following to notifyServiceMap
:
event 0 --> Service0 added: Event ID = 0 - seq# 2147483650
event 1 --> Service0 added: Event ID = 1 - seq# 0
event 2 --> Service1 added: Event ID = 0 - seq# 2147483651
event 3 --> Service1 added: Event ID = 1 - seq# 1
event 4 --> Service2 added: Event ID = 0 - seq# 2147483652
event 5 --> Service2 added: Event ID = 1 - seq# 2
event 6 --> Service3 added: Event ID = 0 - seq# 2147483653
event 7 --> Service3 added: Event ID = 1 - seq# 3

Because notifyServiceMap did not consider the event ID,
it appeared to notifyServiceMap that it was receiving a
stream of events in which the sequence numbers looked like:
{ 2147483650, 0, 2147483651, 1, 2147483652, 2, 2147483653, 3 }

To notifyServiceMap, the sequence appeared to be
alternately "moving backward" and "moving forward" with a large
gap. In each case, though, notifyServiceMap interpreted
the difference between sequence numbers as a gap; that is,
if event_(n+1)_seq# != 1 + event_n_seq#, then a gap is declared.
Thus, as each event was received, notifyServiceMap
declared a gap, and a new snapshot was requested; which meant that for
the four services, a total of eight snapshots were requested.
To address this bug, the method notifyServiceMap was
modified to consider the event ID when determining whether or not
an event sequence contains a gap, which results in the appropriate
behavior.
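A minimal sketch of the idea (not the actual implementation): track the last sequence number separately for each event registration, keyed by event ID, so that interleaved deliveries from distinct registrations are never mistaken for gaps.

    import java.util.HashMap;
    import java.util.Map;

    class SequenceTracker {
        private final Map<Long, Long> lastSeqByEventId =
                new HashMap<Long, Long>();

        // Returns true if a gap is detected for this event's registration.
        synchronized boolean gapDetected(long eventId, long seqNum) {
            Long last = lastSeqByEventId.put(eventId, seqNum);
            if (last == null) {
                return false; // first event for this registration; no basis
            }
            // A gap exists when the sequence number is not exactly last + 1.
            return seqNum != last.longValue() + 1;
        }
    }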
Prior to fixing this bug, whenever a ServiceDiscoveryManager was constructed,
a default LookupDiscoveryManager was first created, even when the deployer's
configuration supplied a DiscoveryManagement item (entry name = discoveryManager).
Although that default manager is initialized to discover no groups and no locators,
the LookupDiscovery
instance used by the default discovery manager to perform group discovery creates
a thread to listen for and process multicast announcements, as well as
additional, related threads. Thus, when a deployer configures a discoveryManager
item, the creation of the default lookup discovery manager -- and the threads
that manager ultimately creates -- is unnecessary and wastes resources.
This bug has been fixed.
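A deployer can also avoid any default manager entirely by constructing the discovery manager explicitly and passing it to the ServiceDiscoveryManager. A minimal sketch, in which the group name "myGroup" is an assumption of the example:

    import net.jini.discovery.LookupDiscoveryManager;
    import net.jini.lookup.ServiceDiscoveryManager;

    public class ExplicitDiscovery {
        public static void main(String[] args) throws Exception {
            // Create the discovery manager directly, so no default
            // LookupDiscoveryManager (and none of its threads) is created.
            LookupDiscoveryManager ldm = new LookupDiscoveryManager(
                    new String[] { "myGroup" }, // groups to discover (assumed)
                    null,                       // no unicast locators
                    null);                      // no discovery listener
            ServiceDiscoveryManager sdm =
                    new ServiceDiscoveryManager(ldm, null);
            // ... use sdm ...
            sdm.terminate();
            ldm.terminate();
        }
    }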
Note 1: Service Reference Comparison
The ability to appropriately compare two different service references
is very important to the
ServiceDiscoveryManager
in general, and the
LookupCache
in particular. Currently, there are three mechanisms used to accurately
compare service references. Each mechanism is applied in different situations,
to achieve different goals.
When storing and managing service references, it is important to be able
to determine when two different references (proxies) refer to the same
back-end service (such references are referred to as duplicates)
so that the storage and management of any duplicate references can be
avoided. In addition to identifying duplicate references, the ability to
determine when a previously discovered service has been replaced with a
new version is also important so that consistent state may be maintained, and entities that wish to know about such events can be informed.
Finally, when an entity wishes to discard (make eligible for rediscovery)
a particular service reference received from the LookupCache,
it is important for the LookupCache to be able to compare the
reference provided by the entity to each of the previously stored references
so that the appropriate reference can be accurately selected for discard.
Comparison by net.jini.core.lookup.ServiceID
To identify, and thus avoid storing, duplicate service references, the
LookupCache
compares the instances of ServiceID
associated with each reference. Recall that an individual well-behaved service of
interest will usually register with multiple lookup services, and for
each lookup service with which that service registers, the
LookupCache
will receive a separate event containing a reference to the service. When the
LookupCache
receives events from multiple lookup services, the
ServiceID
(retrieved from the service reference in the event) is used to distinguish the
service references from each other. In this way, when a new event arrives containing
a reference associated with the same service as an already-stored reference,
the LookupCache
can determine whether or not the new reference is a duplicate; in which case, the
duplicate is ignored.
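A minimal sketch of this kind of duplicate detection (not the LookupCache's own code), using the ServiceID carried in each ServiceItem as the key:

    import java.util.HashMap;
    import java.util.Map;
    import net.jini.core.lookup.ServiceID;
    import net.jini.core.lookup.ServiceItem;

    class ServiceStore {
        private final Map<ServiceID, ServiceItem> itemsById =
                new HashMap<ServiceID, ServiceItem>();

        // Stores the item unless a reference with the same ServiceID is
        // already present, in which case the duplicate is ignored.
        synchronized boolean addIfNew(ServiceItem item) {
            if (itemsById.containsKey(item.serviceID)) {
                return false; // duplicate: refers to the same back-end service
            }
            itemsById.put(item.serviceID, item);
            return true;
        }
    }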
Comparison by net.jini.io.MarshalledInstance.fullyEquals
With respect to determining when a previously discovered service has been
replaced with a new version, the LookupCache
typically relies
on the event mechanism of the lookup service(s) with which that service is
registered to indicate that such an event has occurred. But there are
situations where the events from the lookup services do not provide enough
information to make such a determination. In those cases, the
LookupCache
employs
MarshalledInstance.fullyEquals
to make the determination.
When a well-behaved service is replaced with a new version, the new
version is typically reregistered with each lookup service with
which the old version is registered. As described in the
Jini(TM) Lookup Service Specification,
each lookup service with which this reregistration process occurs first
sends a service-removed event (TRANSITION_MATCH_NOMATCH
), and
then sends a separate service-added event (TRANSITION_NOMATCH_MATCH
).
In this case, there is no ambiguity, and thus no need for the
LookupCache
to compare the new and old service references. This is because the
combination of service-removed and service-added events from each
lookup service is an explicit indication that the service has been
replaced. Note that, as described in the specification, the lookup
service only sends a service-changed event (TRANSITION_MATCH_MATCH
)
when the attributes of the service have been modified; not when the
service itself has been changed (replaced). Thus, if the
LookupCache
receives a TRANSITION_MATCH_MATCH
event, then it is guaranteed
that the service referenced in that event has not been replaced with a new
version.
Whenever the following conditions are satisfied, the LookupCache
will use MarshalledInstance.fullyEquals
to compare two service references:

The references are duplicates (that is, they are associated with the same ServiceID)
It cannot be guaranteed that the two references refer to the same version of the service
When determining whether or not these conditions (in particular, the
second condition) are satisfied, the LookupCache
generally
takes a conservative approach. That is, only when it is absolutely
sure that two duplicates refer to the same version (such as when a
TRANSITION_MATCH_MATCH
event is received), will the
LookupCache
refrain from using
MarshalledInstance.fullyEquals
to compare the duplicate references; otherwise, duplicate references are
always compared using
MarshalledInstance.fullyEquals
.
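A minimal sketch of such a comparison (not the LookupCache's own code): wrap each proxy in a MarshalledInstance and compare the two with fullyEquals, which also takes codebase annotations into account.

    import java.io.IOException;
    import net.jini.core.lookup.ServiceItem;
    import net.jini.io.MarshalledInstance;

    class VersionCheck {
        // Returns true when the two references have identical serialized
        // forms (including codebase annotations), that is, when the
        // duplicates refer to the same version of the service.
        static boolean sameVersion(ServiceItem oldItem, ServiceItem newItem)
                throws IOException {
            MarshalledInstance oldMi = new MarshalledInstance(oldItem.service);
            MarshalledInstance newMi = new MarshalledInstance(newItem.service);
            return oldMi.fullyEquals(newMi);
        }
    }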
One example of a situation where the LookupCache
employs MarshalledInstance.fullyEquals
is the situation
where a lookup service of interest is newly discovered or rediscovered.
In this situation, the LookupCache
retrieves a snapshot of the services of interest that currently reside in that
lookup service. This is done so that previously undiscovered service references,
as well as new versions of previously discovered service references, can
both be stored, duplicates can be ignored, and clients can be sent the
appropriate notifications. Whenever a reference from the snapshot is a
duplicate of a previously discovered service reference, the
LookupCache
always compares the two references using
MarshalledInstance.fullyEquals
.
This is because the possibility always exists that the references may refer to different
versions of the service.
Another example of a situation where the
LookupCache
employs MarshalledInstance.fullyEquals
is when a TRANSITION_NOMATCH_MATCH
event is received that
contains a reference that is a duplicate of a previously discovered
reference. When such an event is received, the
LookupCache
must allow for the possibility that the reference contained
in the event refers to a different version of the service
than that referenced by the previously discovered service reference.
This is because the event may represent either the second half of a
TRANSITION_MATCH_NOMATCH
/TRANSITION_NOMATCH_MATCH
(remove/add) event pair, or it may be a notification of the initial
registration of a new version of the service with one of the (multiple)
lookup services targeted by the
LookupCache
during the service discovery process.
To understand the last example described above, consider the situation
where a service initially registers with one lookup service and then
registers with a second lookup service. If the same version of the
service is registered with each lookup service, both lookup services
will send the same event -- a TRANSITION_NOMATCH_MATCH
event -- to indicate that a new service has registered with the
associated lookup service. In this case, there is no ambiguity for the
LookupCache
;
the second reference can be safely ignored because the references are
duplicates that refer to the exact same service.
But suppose that prior to registering with the second lookup service,
the service is replaced with a new version. In that case, the second
lookup service will still send a TRANSITION_NOMATCH_MATCH
event, but
if the appropriate action is not taken to determine that the old version
of the service has been replaced, the new version of the service will
be ignored by the LookupCache
.
This is because the two references will have the same
ServiceID
,
and thus the LookupCache
will identify the two references as duplicates; ultimately ignoring the new reference.
Of course, if the service is well-behaved, the new version of the service will
eventually reregister with the first lookup service, and that lookup service
will then send a TRANSITION_MATCH_NOMATCH event followed by a
TRANSITION_NOMATCH_MATCH event to indicate that the service has been
replaced. But the LookupCache must still identify and handle this situation
in order to prevent possible state corruption, even though that corruption
may be only temporary.
Thus, whenever a TRANSITION_NOMATCH_MATCH
event is received
and the
associated service reference is a duplicate of a previously discovered
reference, the LookupCache
will always compare the two references using
MarshalledInstance.fullyEquals
to determine whether or not the references refer to the same version of the service.
When they do reference the same version, the
LookupCache
ignores the duplicate reference; otherwise, the
LookupCache
sends a service-removed event followed by a service-added event to indicate
that the old version of the service has been replaced.
Comparison by equals
The mechanism employed by the LookupCache
to select (from
storage) a given reference for discard is the
equals
method provided by the discovered service itself. The
LookupCache
relies on the provider of each service to override the
equals
method with an implementation that allows
for the identification of the service reference that an entity wishes to
have discarded. Although the default implementation of equals
may often be sufficient for proper identification, service providers
are still encouraged to provide each service with its own well-defined
implementation of equals.
In addition to the
equals
method, each service should also
provide a proper implementation of the
hashCode
method.
This is because references to the service may be stored in, or interact with,
container classes (for example,
HashMap
)
where the service's equals
and hashCode
methods may be invoked "under the covers" by the container object with
which the service is interacting. From the point of view of the
ServiceDiscoveryManager
and the LookupCache
,
providing an appropriate implementation for both the
equals
method and the
hashCode
method is a key characteristic of good behavior in a Jini service.
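As an illustration only, a service proxy might base both methods on an identifier of its back-end service; the use of net.jini.id.Uuid for that identifier is an assumption of this sketch.

    import java.io.Serializable;
    import net.jini.id.Uuid;

    public class MyServiceProxy implements Serializable {
        // Identifies the back-end service this proxy refers to (assumed).
        private final Uuid backEndId;

        public MyServiceProxy(Uuid backEndId) {
            this.backEndId = backEndId;
        }

        // Two proxies are equal iff they refer to the same back-end service.
        public boolean equals(Object other) {
            return other instanceof MyServiceProxy
                    && backEndId.equals(((MyServiceProxy) other).backEndId);
        }

        // Consistent with equals, as containers such as HashMap require.
        public int hashCode() {
            return backEndId.hashCode();
        }
    }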
Note 2: The Service Discovery Filtering Mechanism
The specification of the
ServiceItemFilter
interface defines what it means to filter a service reference selected as a
candidate for discovery. In particular, instances of
ServiceItemFilter
can be defined to perform proxy preparation. Thus, through the client-defined filter
(through the
check
method), clients can request that the
ServiceDiscoveryManager
,
rather than the client itself, perform any desired proxy preparation
as part of the service discovery process.
To understand why this is important, consider what can happen when
the client performs preparation of its discovered proxies outside of the
ServiceDiscoveryManager
.
When proxy preparation is performed outside of the service discovery
manager, the client risks encountering a cycle where a matching service is
discovered by the
ServiceDiscoveryManager
,
is found to be untrusted when the client prepares the proxy, is discarded by the client because it is untrusted, and is then ultimately rediscovered
because it still matches the original discovery criteria. Such a cycle
will generally repeat indefinitely because the service is not likely to
become trusted at any point in the future. Supplying the
ServiceDiscoveryManager
with the means to perform proxy preparation on the client's behalf
provides the client with a mechanism for avoiding such a cycle.
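A minimal sketch of such a filter, assuming a deployer-supplied ProxyPreparer. The handling of an indefinite failure (setting the service to null and returning true so that filtering is retried) follows the ServiceItemFilter documentation.

    import java.rmi.RemoteException;
    import net.jini.core.lookup.ServiceItem;
    import net.jini.lookup.ServiceItemFilter;
    import net.jini.security.ProxyPreparer;

    public class PreparingFilter implements ServiceItemFilter {
        private final ProxyPreparer preparer;

        public PreparingFilter(ProxyPreparer preparer) {
            this.preparer = preparer;
        }

        public boolean check(ServiceItem item) {
            try {
                // Replace the candidate proxy with its prepared form.
                item.service = preparer.prepareProxy(item.service);
                return true;  // the prepared service passes the filter
            } catch (SecurityException e) {
                return false; // definite failure: reject the untrusted proxy
            } catch (RemoteException e) {
                item.service = null; // indefinite failure: retry later
                return true;
            }
        }
    }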
For more information, please refer to the section titled,
SD.5.2 The ServiceItemFilter
Interface, contained in
the
Jini(TM) Service Discovery Utilities Specification.
Note 3: The Service Discovery Event Mechanism
The specification of the
ServiceDiscoveryListener
interface describes how the filtering mechanism and the event mechanism provided by instances of
net.jini.lookup.LookupCache
interact.
Additionally, this specification states that instances of the
LookupCache
use the method
net.jini.io.MarshalledInstance.fullyEquals
to determine when a service has changed in some fundamental way (for example,
when a service is replaced with a new version).
For more information, please refer to the section titled,
SD.5.4 The ServiceDiscoveryListener
Interface, contained in the
Jini(TM) Service Discovery Utilities Specification.
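For illustration, a minimal sketch of a client-defined listener; the logging behavior is arbitrary. Note that a replaced service surfaces as a serviceRemoved notification followed by a serviceAdded notification.

    import net.jini.lookup.ServiceDiscoveryEvent;
    import net.jini.lookup.ServiceDiscoveryListener;

    public class LoggingListener implements ServiceDiscoveryListener {
        public void serviceAdded(ServiceDiscoveryEvent event) {
            System.out.println("discovered: "
                    + event.getPostEventServiceItem().serviceID);
        }
        public void serviceRemoved(ServiceDiscoveryEvent event) {
            System.out.println("discarded: "
                    + event.getPreEventServiceItem().serviceID);
        }
        public void serviceChanged(ServiceDiscoveryEvent event) {
            // Attribute change only; the service itself was not replaced.
            System.out.println("attributes changed: "
                    + event.getPostEventServiceItem().serviceID);
        }
    }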
Bug ID | Description |
4353391 | LookupCache.terminate() can cause an InternalError in the task thread. Prior to fixing this bug, the service discovery manager passed the listener implementation, rather than its stub, to the lookup service proxy's notify method. Under normal circumstances, RMI automatically replaces the implementation with the stub, provided the implementation is exported. However, when the terminate method on a LookupCache is called, the implementation is unexported and, because of the way the service discovery manager manages its task queue, terminate does nothing about in-progress or pending tasks for the LookupCache, such as calls to notify. Because of the change that went into Java 2 SDK, v 1.2.2, if a remote object is neither a stub nor an exported implementation, RMI simply tries to marshal it rather than throwing a StubNotFoundException. However, the service discovery manager's listener implementation extends UnicastRemoteObject, which ultimately extends RemoteObject, and RemoteObject has a writeObject method that throws an InternalError (because prior to the 1.2.2 change, this was not supposed to be able to happen). |
4367215 | ServiceDiscoveryManager: An ArrayIndexOutOfBoundsException can occur because
totalMatches is used as the number of discovered services to process rather than
items.length. Prior to fixing this bug, the service discovery manager used the value of ServiceMatches.totalMatches, rather than ServiceMatches.items.length, as the number of services returned from a query on a lookup service. This meant that an ArrayIndexOutOfBoundsException could occur if totalMatches ever exceeded the number of services actually returned (which can happen when more services matching the template are currently registered than the maximum number of services requested). |
4340939 | Two versions of ServiceDiscoveryManager.lookup contain a potential race condition. When one of the blocking lookup methods (the versions that take a waitDur argument) is called, the method starts tasks that send events to the internal ServiceDiscoveryListener when matching services appear, and then waits in a loop for notifications to occur. The call to object.notifyAll that interrupts the waiting thread is made in the ServiceDiscoveryListener when remote events arrive signaling state changes related to services of interest. Prior to fixing this bug, the call to notifyAll was not guaranteed to occur after the waiting thread entered the wait state. If the call to notifyAll happened to occur before the wait state was entered, the lookup method could unnecessarily block for the full amount of time. |
4366369 | The service discovery manager should trap exceptions that occur in the task thread. Prior to fixing this bug, if an exception occurred in one of the tasks being executed by the service discovery manager's task thread, the exception was not caught, the encompassing task thread would die, and no more tasks would be executed, rendering the current instance of the service discovery manager incapable of processing further tasks. With this bug fix, not only are such exceptions trapped, but if debugging is enabled, the stack trace is displayed. |