Grouping nodes of the same label by parameter





I'm new to graph databases, so apologies if I get some of the terminology wrong.



I'm using Neo4j and have a dataset made up mostly of one kind of node. These nodes have a variety of parameters, as well as relationships to each other and to the other labeled nodes in the graph.



To give a simple example of what I'm trying to achieve, let's assume I have a label of "Person". Each Person has a parameter named "gender", which will have a value of "male" or "female". What's the best practice if I want to run a query that will return all males in one variable, and all females in the other? Should they be separate labels? That seems like a bad idea given the parameters on each are identical.




2 Answers



Since Neo4j maintains label count statistics, using Male and Female labels will get you those counts immediately, without even needing to do any node queries.





For example, this query gets the number of Male nodes from the statistics:




MATCH (:Male)
RETURN COUNT(*) AS males



However, the current Cypher planner seems to refuse to use the statistics a second time in the same query (based on my PROFILE runs), so the following query will actually scan the DB for Female nodes. Hopefully this can be improved in future Cypher planners.




MATCH (m:Male)
WITH COUNT(m) AS males
MATCH (f:Female)
RETURN males, COUNT(f) AS females
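
One way to check which plan a given Neo4j version picks is to prefix the query with PROFILE and read the operators in the resulting plan. As a rough guide, a count answered from the statistics typically appears as a NodeCountFromCountStore operator, while a scan appears as NodeByLabelScan; treat the exact operator names as version-dependent.

```cypher
// If the plan shows NodeCountFromCountStore, the count came from the statistics.
PROFILE
MATCH (:Male)
RETURN count(*) AS males
```

Running the two-label query above with the same PROFILE prefix shows whether the second count is still served by a label scan on your version.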



[UPDATE 1]



However, as suggested by @InverseFalcon, using UNION ALL does cause the statistics to be used every time:




MATCH (m:Male) RETURN {male: COUNT(m)} AS counts
UNION ALL
MATCH (f:Female) RETURN {female: COUNT(f)} AS counts



[UPDATE 2]



If you want to get the actual nodes instead of the counts, there are two approaches with about the same performance (as shown by their PROFILE output).





You can use Male and Female labels:




MATCH (m:Male)
WITH COLLECT(m) AS males
MATCH (f:Female)
RETURN males, COLLECT(f) AS females
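
If the data currently only carries the gender property, the labels can be derived from it. A minimal sketch, assuming the property values are exactly 'male' and 'female' as in the question, run as two separate statements:

```cypher
// Add the Male label to every Person whose gender property is 'male'.
MATCH (p:Person {gender: 'male'})
SET p:Male;

// Add the Female label to every Person whose gender property is 'female'.
MATCH (p:Person {gender: 'female'})
SET p:Female;
```

The nodes keep their Person label, so existing queries against Person continue to work.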



You can create an index on :Person(gender):




MATCH (m:Person {gender: 'male'})
WITH COLLECT(m) AS males
MATCH (f:Person {gender: 'female'})
RETURN males, COLLECT(f) AS females



However, this approach would require more storage, since you'd have to store the gender property with every node.
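
For reference, the index on :Person(gender) mentioned above can be created roughly as follows. The syntax depends on the Neo4j version, and the index name person_gender is only an illustrative choice.

```cypher
// Neo4j 4.0 and later: named index on the gender property of Person nodes.
CREATE INDEX person_gender
FOR (p:Person) ON (p.gender);

// Older syntax used by Neo4j 3.x:
// CREATE INDEX ON :Person(gender);
```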





In the meantime, here's a neat Cypher trick which will utilize the counts store using UNION ALL:

MATCH (m:Male) RETURN {male: COUNT(m)} AS counts
UNION ALL
MATCH (f:Female) RETURN {female: COUNT(f)} AS counts

– InverseFalcon
2 days ago





I'm looking for the actual nodes to be returned rather than only their counts, but thank you all the same.
– user1381745
19 hours ago





See my updated answer.
– cybersam
10 hours ago



You can use a single Person label for both, with a gender property.





The query below returns two lists in a single row: the first contains the males and the second contains the females.




MATCH (male:Person {gender: 'male'})
WITH COLLECT(male) AS maleList
MATCH (female:Person {gender: 'female'})
RETURN maleList, COLLECT(female) AS femaleList



I'm not sure whether this is the best query, but it should return what you need.
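
Another option, not taken from either answer, is to rely on Cypher's implicit grouping: any non-aggregated expression in RETURN acts as a grouping key. A minimal sketch, assuming the Person label and gender property from the question:

```cypher
// One row per distinct gender value, each with the list of matching nodes.
MATCH (p:Person)
RETURN p.gender AS gender, collect(p) AS people
```

This returns one row per value rather than one column per value, so it keeps working unchanged if further values are added later. A value with no matching nodes simply produces no row.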





