Python reduce dictionary memory by bits as keys -
i have dna data 11 million entries. count occurrences of patterns on data "atcg". called kmer counting meaning counting k length substring occurrences. have no problem kmer algorithm. stored counted values in dictionary. keys "aattaccacttcatgtattaaaagatactaac". storing keys strings , dictionary memory increases , gives me memory error. try find way reduce dictionary memory. resources suggests keys can hold ints or bits instead of strings. way store keys efficient , how can that?
related code :
dd = {} open(filename,'r') f: line in f: line2 = line.rstrip('\n') n = len(line2) in range(n-k+1): pattern = line2[i:i+k] if pattern in dd: dd[pattern] = dd[pattern] + 1 else: dd[pattern] = 1
edited: dictionary printed dictionary;
('cccccccccccccccccccccccccccccc', 105) ('gaaaaggatcaagagcccatttatgccata', 42) ('gagaaagatgaaaaggatcaagagcccatt', 42)
wiki
Comments
Post a Comment