Storing counter inside cassandra collection
I want to store aggregated data from my sensor. Here's my schema plan for the Cassandra table:

UPDATED

```sql
CREATE TABLE every_second_aggregate_signature (
    device_id text,
    year int,
    month int,
    day int,
    hour int,
    minute int,
    second int,
    signature map<text, counter>,
    PRIMARY KEY ((device_id, year, month, day, hour, minute, second))
) WITH CLUSTERING ORDER BY (year DESC, month DESC, day DESC, hour DESC, minute DESC, second DESC);
```
The signature data is a set of increment values with dynamic keys, e.g.:

```
ts1 - {"4123sad" : 80, "djas10" : 99}
ts2 - {"4123sad" : 83, "djas10" : 103}
ts3 - {"4123sad" : 87, "djas10" : 198, "asfac9" : 281}
ts4 - {"4123sad" : 89, "djas10" : 201, "asfac9" : 540, "asd81" : 12}
```

The problem is that, as far as I know, Cassandra doesn't support counters inside collections. Is there an alternative approach or solution to this problem? Thanks for your help.
@ChrisLohfink updated the question, thanks – Dimas Rizky (yesterday)
1 Answer
The only alternative approach here is to move the "key" of the map into the primary key of the table. Right now you're trying to model it like this (as I understand it):

```sql
create table counters (
    ts timestamp primary key,
    counters map<text, counter>
);
```
Then you'll need to change it to the following:

```sql
create table counters (
    ts timestamp,
    key text,
    counter counter,
    primary key (ts, key)
);
```
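With this layout, each counter is updated by addressing its row directly; counter columns only accept relative updates (increment/decrement), never plain assignment. A minimal sketch, using the answer's table and column names (the timestamp and key values are illustrative):

```sql
-- bump the counter for one key at one timestamp;
-- the row is created implicitly on first increment
UPDATE counters
   SET counter = counter + 4
 WHERE ts = '2021-06-01 12:00:00' AND key = '4123sad';
```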
And to select all values that should go into the map, you simply do:

```sql
select ts, key, counter from counters where ts = 'some value';
```

It will return every key/counter pair for the given ts as separate rows, so you'll need code that merges them into a map...
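That merge step can be sketched in Python. Here I assume the query results arrive as `(ts, key, counter)` tuples (as they would from a driver's row iterator); the function name and sample data are illustrative:

```python
def rows_to_map(rows):
    """Merge (ts, key, counter) rows for a single ts into one dict,
    reconstructing the map<text, counter> the original schema wanted."""
    merged = {}
    for ts, key, counter in rows:
        merged[key] = counter
    return merged

# e.g. the three rows that would come back for ts3 in the question:
rows = [("ts3", "4123sad", 87), ("ts3", "djas10", 198), ("ts3", "asfac9", 281)]
print(rows_to_map(rows))  # {'4123sad': 87, 'djas10': 198, 'asfac9': 281}
```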
It is possible actually, but does that mean I have to store each key as one row? It's my first time working with Cassandra; doesn't this type of model make the workload high? – Dimas Rizky (yesterday)
Yes, you need to store every key as a separate row. It's a limitation of the counter type: you can't have counter and non-counter values inside the same table; all non-counter columns must be part of the primary key. – Alex Ott (yesterday)
Also, if you're going to count something where you need precision, then the counter type isn't what you need: counters are approximate and can diverge. – Alex Ott (yesterday)
Actually, I'm not planning to use Cassandra's aggregate functions. I will aggregate and count the metrics inside Apache Spark, so Spark will update the table each time a new event is streamed. Precision is not my #1 concern; performance is. – Dimas Rizky (4 hours ago)
I meant that you can lose some data, as the counter type isn't precise... – Alex Ott (4 hours ago)
What are your primary and clustering keys? – Chris Lohfink (2 days ago)