Grouping nodes of the same label by parameter
I'm new to graph databases, so apologies if I get some of the correct terminology wrong.
I'm using Neo4j and have a dataset made up of - mostly - one kind of node. These nodes have a variety of parameters and relationships between each other and the other labeled nodes in the graph.
To give a simple example of what I'm trying to achieve, let's assume I have a label of "Person". Each Person has a parameter named "gender", which will have a value of "male" or "female". What's the best practice if I want to run a query that will return all males in one variable, and all females in the other? Should they be separate labels? That seems like a bad idea given the parameters on each are identical.
2 Answers
Since the Neo4j DB maintains label count statistics, using Male and Female labels will get you those counts immediately, without even needing to do any node queries.
For example, this query gets the number of Male nodes from the statistics:
MATCH (:Male)
RETURN COUNT(*) AS males
However, the current Cypher planner seems to refuse to use the statistics a second time in the same query (based on my PROFILE runs), so the following query will actually scan the DB for Female nodes. Hopefully this can be improved in future Cypher planners.
MATCH (m:Male)
WITH COUNT(m) AS males
MATCH (f:Female)
RETURN males, COUNT(f) AS females
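To check this on your own data, you can prefix a query with PROFILE and look at the operators in the resulting plan. As a rough guide (operator names may differ between Neo4j versions), a plan containing NodeCountFromCountStore means the count came from the statistics, while NodeByLabelScan means the nodes were actually scanned. A minimal sketch:

```cypher
// Profile the single-label count; the plan is expected to show
// NodeCountFromCountStore rather than NodeByLabelScan.
PROFILE
MATCH (:Male)
RETURN COUNT(*) AS males
```

Running the two-label query above with the same PROFILE prefix lets you compare the plans side by side.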
[UPDATE 1]
However, as suggested by @InverseFalcon, using UNION ALL does cause the statistics to be used every time:
MATCH (m:Male) RETURN {male: COUNT(m)} AS counts
UNION ALL
MATCH (f:Female) RETURN {female: COUNT(f)} AS counts
[UPDATE 2]
If you want to get the actual nodes instead of the counts, then there are 2 approaches with about the same performance (as shown by their PROFILEs).
You can use Male and Female labels:
MATCH (m:Male)
WITH COLLECT(m) AS males
MATCH (f:Female)
RETURN males, COLLECT(f) AS females
You can create an index on :Person(gender):
MATCH (m:Person {gender: 'male'})
WITH COLLECT(m) AS males
MATCH (f:Person {gender: 'female'})
RETURN males, COLLECT(f) AS females
However, this approach would require more storage, since you'd have to store the gender property with every node.
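Both approaches need some one-off setup, which could be sketched as follows. This assumes the nodes currently carry gender as a property on :Person; the index name person_gender in the commented-out line is made up for illustration, and the exact index syntax depends on your Neo4j version.

```cypher
// Approach 1 setup: derive Male/Female labels from the existing gender property.
MATCH (p:Person {gender: 'male'})
SET p:Male;

MATCH (p:Person {gender: 'female'})
SET p:Female;

// Approach 2 setup: index on :Person(gender).
// Legacy syntax (Neo4j 3.x):
CREATE INDEX ON :Person(gender);
// Newer syntax (Neo4j 4.x and later), with a hypothetical index name:
// CREATE INDEX person_gender FOR (p:Person) ON (p.gender);
```

You would only run the part that matches the approach you pick.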
I'm looking for the actual nodes to be returned rather than only their counts, but thank you all the same.
– user1381745
19 hours ago
See my updated answer.
– cybersam
10 hours ago
You can have a single Person label for both, with a gender property.
The query below returns a single row containing two lists: the first holds the males and the second holds the females.
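// Restrict both matches to :Person so the whole graph is not scanned
MATCH (male:Person {gender: 'male'})
WITH COLLECT(male) AS maleList
MATCH (female:Person {gender: 'female'})
RETURN maleList, COLLECT(female) AS femaleList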
I'm not sure it is the best query, but it should return what you need.
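If one row per group is acceptable instead of two separate columns, a single-pass alternative is to let Cypher group by the property itself. This is only a sketch under the same assumption (a gender property on :Person nodes); any Person without that property would end up in a group whose gender is null.

```cypher
// Non-aggregated return items act as grouping keys, so this yields
// one row per distinct gender value with the matching nodes collected.
MATCH (p:Person)
RETURN p.gender AS gender, COLLECT(p) AS people
```

This also keeps working unchanged if more values than 'male' and 'female' show up in the data.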
In the meantime, here's a neat Cypher trick which will utilize the counts store using UNION ALL:
MATCH (m:Male) RETURN {male:COUNT(m)} AS counts UNION ALL MATCH (f:Female) RETURN {female:COUNT(f)} AS counts
– InverseFalcon
2 days ago