Python: reduce dictionary memory by using bits/ints as keys




I have DNA data with 11 million entries and I count occurrences of patterns over the alphabet "atcg". This is called k-mer counting: counting how often each substring of length k occurs. I have no problem with the k-mer algorithm itself. I store the counts in a dictionary, with keys like "aattaccacttcatgtattaaaagatactaac". Because the keys are stored as strings, the dictionary's memory use keeps growing and I eventually get a MemoryError, so I am trying to find a way to reduce the dictionary's memory. Some resources suggest that the keys could hold ints or bits instead of strings. Is that a more efficient way to store the keys, and how can I do it?
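From what I gather, the usual idea is a 2-bit-per-base encoding, so a k-mer becomes a single integer. A minimal sketch of what that could look like (the helper names encode_kmer/decode_kmer are my own, and it assumes the sequence only contains lowercase a/c/g/t):

    # Pack each base into 2 bits so a k-mer becomes one integer.
    BASE_TO_BITS = {'a': 0, 'c': 1, 'g': 2, 't': 3}
    BITS_TO_BASE = 'acgt'

    def encode_kmer(kmer):
        """Encode an 'acgt' string as an integer, 2 bits per base."""
        code = 0
        for base in kmer:
            code = (code << 2) | BASE_TO_BITS[base]
        return code

    def decode_kmer(code, k):
        """Recover the original k-mer string from its integer code."""
        bases = []
        for _ in range(k):
            bases.append(BITS_TO_BASE[code & 3])  # low 2 bits = last base
            code >>= 2
        return ''.join(reversed(bases))

For example, encode_kmer('atcg') gives 54, and decode_kmer(54, 4) gives back 'atcg'. The k has to be passed to decode_kmer because leading 'a' bases encode as zero bits and would otherwise be lost.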

Related code:

dd = {}
with open(filename, 'r') as f:
    for line in f:
        line2 = line.rstrip('\n')
        n = len(line2)
        for i in range(n - k + 1):
            pattern = line2[i:i+k]
            if pattern in dd:
                dd[pattern] = dd[pattern] + 1
            else:
                dd[pattern] = 1
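If the 2-bit encoding sketched above is used, the same loop could key the dictionary on the integer code instead of the string (again just a sketch, assuming the hypothetical encode_kmer helper and clean a/c/g/t input):

    dd = {}
    with open(filename, 'r') as f:
        for line in f:
            line2 = line.rstrip('\n')
            n = len(line2)
            for i in range(n - k + 1):
                # Store the 2-bit integer code instead of the string itself.
                code = encode_kmer(line2[i:i+k])
                dd[code] = dd.get(code, 0) + 1

The keys can then be turned back into strings with decode_kmer(code, k) when the results are printed.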

Edited: this is what the dictionary looks like when printed:

('cccccccccccccccccccccccccccccc', 105)
('gaaaaggatcaagagcccatttatgccata', 42)
('gagaaagatgaaaaggatcaagagcccatt', 42)




