Immutable hashcode for project and module objects

In our IntelliJ plugin, we have a long-term data structure in the form of a hashset. The hashset stores objects whose hashcode should be determined by contained project and/or module objects.

The problem is that I was not able to find values that can be used to create an immutable hashcode, i.e. values that won't change over the lifecycle of a project/module object. The best candidates I could find were the name and the base path, both of which can change at runtime.
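To make the failure mode concrete, here is a minimal sketch of what happens when a hashcode is derived from a value that changes while the object sits in a hashset. `NameHashedModule` and `StaleHashDemo` are hypothetical stand-ins written for this illustration and are not part of the IntelliJ API.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for a wrapper whose hashcode is derived from a mutable name.
final class NameHashedModule {
    private String name;

    NameHashedModule(String name) {
        this.name = name;
    }

    void rename(String newName) {
        this.name = newName;
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof NameHashedModule
                && ((NameHashedModule) other).name.equals(name);
    }

    @Override
    public int hashCode() {
        return name.hashCode();
    }
}

final class StaleHashDemo {
    // Returns whether the set still finds the element after it was renamed in place.
    static boolean foundAfterRename() {
        Set<NameHashedModule> shared = new HashSet<>();
        NameHashedModule module = new NameHashedModule("m1");
        shared.add(module);

        // The set stored the entry under the hash of "m1"; the object now hashes as "m2".
        module.rename("m2");
        return shared.contains(module);
    }

    public static void main(String[] args) {
        System.out.println("found after rename: " + foundAfterRename());
    }
}
```

The lookup fails even though the very same object is still in the set, which is why re-adding on every change (or a hashcode that never changes) would be needed.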

I also considered using the default hashCode of the objects. Even though the default implementations of both classes don't override the hashCode method, the objects returned by the API seem to be singletons. But I could not find any reference to this in the documentation and didn't want to base the logic solely on behavior observed during limited testing.

Are module and project objects guaranteed to be singletons across their lifecycle?

If not, can you recommend a way of obtaining immutable hash-codes for such objects? (I also considered re-adding the objects when values change, but had problems reacting to certain changes; e.g. I could not figure out how to synchronously react to project renamings.)

7 comments

Could you please give more details on what exactly this persistent data structure is used for? Is it not possible to calculate this data on-the-fly?


I am working on an IntelliJ implementation of Saros, an IDE plugin allowing for remote collaborative programming.

This is currently done by listening for actions that are of interest to us (like document modification, vfs modifications, text highlights, etc.) on an application scale and then filtering for the ones we want to 'share' (synchronize with other users).

Sharing is currently done on a module basis, meaning you can share all (non-excluded) resources of a module. The shared modules are held in a hashset. Whenever we are notified about an action, we check if its resource belongs to a shared module.

This explains the need for an immutable hashcode for modules, as we want to be able to correctly filter actions across the IDE lifecycle.

Furthermore, we want to distinguish between modules with the same name that are opened in different projects. If this can be done solely through the module object, an immutable hashcode for project objects is not of interest for us.

 

I hope this more detailed description helps to convey the issue.

In the context of the issue, I am not sure what you mean by 'calculate this data on-the-fly'. If this does not stem from a different understanding of the problem, please elaborate.


Thanks for the detailed explanation.

You can use ModulePointer (via ModulePointerManager) to store a reference to a module. A module _always_ belongs to a certain Project, see com.intellij.openapi.module.Module#getProject.

With "on-the-fly" calculation I meant calculating whether a specific Module is still valid/present in the current Project and/or whether a specific file belongs to a specific Module at the moment this information is actually needed (instead of relying on possibly outdated cached information).


> You can use ModulePointer (via ModulePointerManager) to store a reference to a module. A module _always_ belongs to a certain Project, see com.intellij.openapi.module.Module#getProject.

As a side-question: Is it preferable to store the ModulePointer instead of the Module directly (for longer-term storage)? If so, why?

 

> A module _always_ belongs to a certain Project, see com.intellij.openapi.module.Module#getProject.

I just mentioned the project explicitly as I was previously using the module name to identify modules. But, when allowing multiple open projects, module names are no longer sufficient to uniquely identify modules, so in this case I would have to use other characteristics (like the module file path) or also use project characteristics to uniquely identify the project-module-combination (project name + module name are not sufficient either).

 

> With "on-the-fly" calculation I meant to calculate whether specific Module is still valid/present in current Project and/or whether specific file belongs to specific Module when this information is actually needed (instead of relying on possibly outdated cached information).

I am not sure whether this really helps me. Can I use the hashcode of the ModulePointer?

Yes, I know that I can use the API to check whether a resource belongs to a module and that I can also get the module for any resource. Sadly, this doesn't really help me in this case. Our plugin has multiple implementations for different IDEs (currently Eclipse and IntelliJ). To accommodate this, we have a "core" implementation containing central logic for the plugin that is shared by all implementations.
The comparison (and also the hashset of shared resources, i.e. modules) is located in this central part of the logic. Currently, any resource activity (regardless of whether it was generated locally or by a remote participant) is passed to a central component, which is supposed to decide whether it is shared. This is currently done by calculating the resource root (the module in this case) and checking if it is contained in the held hashset. (Furthermore, we are indirectly using the module reference as a key in several hashmaps, which also requires a constant/immutable hashcode value.) This is why I would like an immutable hashcode.
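The check described above can be sketched roughly as follows. All class names (`ResourceRoot`, `ResourceActivity`, `SharedRootFilter`) are hypothetical stand-ins and not the actual Saros or IntelliJ types. The sketch assumes the root objects keep the identity-based equals/hashCode inherited from Object, so neither a rename nor a same-named module in another project affects the lookup.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for a shared root (a module in the IntelliJ case).
// It does not override equals/hashCode, so both are identity-based.
final class ResourceRoot {
    private String name;

    ResourceRoot(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }

    void rename(String newName) {
        this.name = newName;
    }
}

// Hypothetical stand-in for a resource activity: a path relative to its root.
final class ResourceActivity {
    final ResourceRoot root;
    final String relativePath;

    ResourceActivity(ResourceRoot root, String relativePath) {
        this.root = root;
        this.relativePath = relativePath;
    }
}

final class SharedRootFilter {
    private final Set<ResourceRoot> sharedRoots = new HashSet<>();

    void share(ResourceRoot root) {
        sharedRoots.add(root);
    }

    // An activity is shared if the root of its resource is in the set.
    boolean isShared(ResourceActivity activity) {
        return sharedRoots.contains(activity.root);
    }

    public static void main(String[] args) {
        ResourceRoot appInProjectA = new ResourceRoot("app");
        ResourceRoot appInProjectB = new ResourceRoot("app"); // same name, other project

        SharedRootFilter filter = new SharedRootFilter();
        filter.share(appInProjectA);
        appInProjectA.rename("app-renamed");

        System.out.println(filter.isShared(new ResourceActivity(appInProjectA, "src/Main.java")));
        System.out.println(filter.isShared(new ResourceActivity(appInProjectB, "src/Main.java")));
    }
}
```

Under these assumptions the first lookup succeeds despite the rename and the second fails despite the identical name.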

There are other alternatives (like, as you mentioned, always dynamically checking whether the resource belongs to one of the modules), but they would require a lot of restructuring of code shared by different implementations. Could you give me an answer/indication on my initial question (whether module objects are guaranteed to be singletons across their lifecycle, so the default hashcode can be used, or whether there is a different way to generate an immutable hashcode for module objects)?

If there is no feasible solution to generate such a hashcode, I will have to bite the bullet and restructure the code. But otherwise, I would like to avoid it. I hope you can see where I'm coming from.


Hello Tobias,

ModulePointer is used to provide a reliable reference to a module by name. If you have a part of the configuration which refers to a module by name, it may happen that a module with such a name doesn't exist (yet), so you won't be able to get a Module instance. But if you store a module name instead of a Module instance, the reference will break if the module is renamed. In such cases ModulePointer will help you: you can create it from a module name and it will correctly process renames. However, if you don't need to refer to non-existing modules, there is no need to use ModulePointer; you can just use the Module instance instead. Also, it makes no sense to use the hashcode of a ModulePointer, because there may be several ModulePointers pointing to the same Module.

Regarding your original question: Module and Project implementations don't override the default (identity-based) hashCode and equals, and their instances are valid until the project is opened in the IDE. So you can use them as keys. However, note that the hashcodes of these objects will change after you restart the IDE. Currently, to get an ID for a module that survives restarts, we use a trinity of Module::getName, Project::getName and Project::getLocationHash (the latter returns a hash code of the path to the project directory). We use such IDs to save module-specific caches, for example, so it isn't a problem that they won't survive if a module is renamed.
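As an illustration of the restart-surviving ID, here is a minimal sketch that combines the three values into one string key. `ModuleCacheId` and the `/` separator are assumptions made for this example, not the format used inside the IDE. In a plugin, the three arguments would be obtained from Module::getName, Project::getName and Project::getLocationHash.

```java
// Hypothetical helper that builds a restart-stable key from the three values
// named above. The separator and ordering are arbitrary choices for this sketch.
final class ModuleCacheId {
    private ModuleCacheId() {
    }

    static String of(String moduleName, String projectName, String projectLocationHash) {
        return projectName + '/' + projectLocationHash + '/' + moduleName;
    }

    public static void main(String[] args) {
        // Placeholder values; real ones would come from the Module and Project objects.
        String before = of("core", "demo", "1a2b3c");
        String sameAfterRestart = of("core", "demo", "1a2b3c");
        String afterRename = of("core-renamed", "demo", "1a2b3c");

        System.out.println(before.equals(sameAfterRestart));
        System.out.println(before.equals(afterRename));
    }
}
```

The key depends only on names and the project location, so it is the same in every IDE session, but it changes as soon as the module or project is renamed or the project is moved.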


Thanks to both of you for your continued patience and quick responses to my questions. Your help is really appreciated. :)

> Module and Project implementations don't override the default (identity based) hashCode and equals, and their instances are valid until the project is opened in the IDE.

Do you mean "until the project is no longer open in the IDE"? Otherwise, I don't quite understand what you are trying to tell me.

 

Regarding the topic of module references across different IDE/application lifecycles: This is currently not something we have to deal with, as our sessions are completely bound to the application lifecycle and we don't have any configurations concerning specific modules/projects. But still, nice to know for possible future requirements.


> Do you mean "until the project is no longer open in the IDE"?

Yes, this is what I meant.
