Storing counter inside cassandra collection


I want to store aggregated data from my sensor. Here's my planned schema for the Cassandra table:



UPDATED


CREATE TABLE every_second_aggregate_signature(
device_id text,
year int,
month int,
day int,
hour int,
minute int,
second int,
signature map<text,counter>,
PRIMARY KEY ((device_id), year, month, day, hour, minute, second)
) WITH CLUSTERING ORDER BY (year DESC, month DESC, day DESC, hour DESC, minute DESC, second DESC);



The signature data consists of incrementing values under dynamic keys, e.g.:


ts1 - {"4123sad" : 80, "djas10" : 99}
ts2 - {"4123sad" : 83, "djas10" : 103}
ts3 - {"4123sad" : 87, "djas10" : 198, "asfac9" : 281}
ts4 - {"4123sad" : 89, "djas10" : 201, "asfac9" : 540, "asd81" : 12}



The problem is that I know Cassandra doesn't support counters inside collections. Is there an alternative approach or solution to this problem? Thanks for your help.





What are your primary and clustering keys?
– Chris Lohfink
2 days ago





@ChrisLohfink updated the question, thanks
– Dimas Rizky
yesterday




1 Answer
1



The only alternative approach here is to move the "key" of the map into the primary key of the table. Right now you're trying to model it like this (as I understood):


create table counters (
ts timestamp primary key,
counters map<text, counter>
);



Then you'll need to change it to the following:


create table counters (
ts timestamp,
key text,
counter counter,
primary key (ts, key)
);
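With that model, each counter is bumped with a standard counter `UPDATE`. A sketch (the timestamp literal and the delta are illustrative; `'4123sad'` is one of the example keys from the question):

```sql
-- Counter columns can only be modified via UPDATE ... = ... + delta;
-- the row is created implicitly on first increment.
UPDATE counters
SET counter = counter + 4
WHERE ts = '2018-01-01 00:00:01' AND key = '4123sad';
```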



And to select all values that should go into the map, you simply do:


select ts, key, counter from counters where ts = 'some value';



It will return every key/counter pair for the given ts as separate rows, so you'll need code that merges them back into a map.
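A minimal sketch of that merge step in Python. The tuples stand in for a driver result set; the function name and row shape are illustrative, not part of any Cassandra driver API:

```python
from collections import defaultdict

def rows_to_maps(rows):
    """Merge (ts, key, counter) rows into {ts: {key: counter}} maps."""
    maps = defaultdict(dict)
    for ts, key, counter in rows:
        maps[ts][key] = counter
    return dict(maps)

# Rows as they would come back from the per-key counter table:
rows = [
    ("ts3", "4123sad", 87),
    ("ts3", "djas10", 198),
    ("ts3", "asfac9", 281),
]
print(rows_to_maps(rows))
# {'ts3': {'4123sad': 87, 'djas10': 198, 'asfac9': 281}}
```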







That is possible, actually, but does it mean I have to store each key as a separate row? It's my first time working with Cassandra; doesn't this kind of model make the workload high?
– Dimas Rizky
yesterday






Yes, you need to store every key as a separate row. It's a limitation of the counter type: you can't mix counter and non-counter values in the same table; all non-counter columns must be part of the primary key.
– Alex Ott
yesterday







Also, if you're going to count something where you need precision, the counter type isn't what you need: counters are approximate and can diverge.
– Alex Ott
yesterday





Actually, I'm not planning to use Cassandra's aggregate functions. I will aggregate and count the metrics inside Apache Spark, so Spark will update the table as each new event is streamed. Precision is not my #1 concern; performance is.
– Dimas Rizky
4 hours ago






I meant that you can lose some data, as the counter type isn't precise...
– Alex Ott
4 hours ago





