machine learning - Why does MITIE get stuck on segment classifier?




I'm building a model with MITIE using a training dataset of 1,400 sentences, each between 3 and 10 words long, paired with around 120 intents. Model training gets stuck at "Part II: train segment classifier." I've let it run for 14 hours before terminating it.
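For context, training is kicked off roughly like the sketch below, using the rasa_nlu 0.x Python API with the "mitie" pipeline (the file names are placeholders, and the exact import paths differ in newer rasa_nlu releases):

    from rasa_nlu.converters import load_data      # rasa_nlu.training_data.load_data in newer releases
    from rasa_nlu.config import RasaNLUConfig
    from rasa_nlu.model import Trainer

    # config_mitie.json is a placeholder config that selects the "mitie" pipeline,
    # sets "language": "en", and points "mitie_file" at total_word_feature_extractor.dat.
    training_data = load_data("data/training_data.json")   # ~1,400 examples, ~120 intents
    trainer = Trainer(RasaNLUConfig("config_mitie.json"))
    trainer.train(training_data)        # hangs inside ner_mitie at "Part II: train segment classifier"
    model_directory = trainer.persist("./models/")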

My machine has a 2.4 GHz Intel Core i7 and 8 GB of 1600 MHz DDR3 RAM. The segment classifier uses all available memory (around 7 GB) and then relies on compressed memory; at the end of the last session, Activity Monitor showed 32 GB used and 27 GB compressed. The segment classifier has never completed.

My current output is below:

    INFO:rasa_nlu.model:Starting to train component nlp_mitie
    INFO:rasa_nlu.model:Finished training component.
    INFO:rasa_nlu.model:Starting to train component tokenizer_mitie
    INFO:rasa_nlu.model:Finished training component.
    INFO:rasa_nlu.model:Starting to train component ner_mitie
    Training to recognize 20 labels: 'pet', 'room_number', 'broken_things', '@sys.ignore', 'climate', 'facility', 'gym', 'medicine', 'item', 'exercise_equipment ', 'service', 'number', 'electronic_device', 'charger', 'toiletries', 'time', 'date', 'facility_hours', 'cost_inquiry', 'tv channel'
    Part I: train segmenter
    words in dictionary: 200000
    num features: 271
    now do training
    C:           20
    epsilon:     0.01
    num threads: 1
    cache size:  5
    max iterations: 2000
    loss per missed segment:  3
    C: 20        loss: 3          0.669591
    C: 35        loss: 3          0.690058
    C: 20        loss: 4.5        0.701754
    C: 5         loss: 3          0.616959
    C: 20        loss: 1.5        0.634503
    C: 28.3003   loss: 5.74942    0.71345
    C: 25.9529   loss: 5.72171    0.707602
    C: 27.7407   loss: 5.97907    0.707602
    C: 30.2561   loss: 5.61669    0.701754
    C: 27.747    loss: 5.66612    0.710526
    C: 28.9754   loss: 5.82319    0.707602
    best C: 28.3003
    best loss: 5.74942
    num feats in chunker model: 4095
    train: precision, recall, f1-score: 0.805851 0.885965 0.844011
    Part I: elapsed time: 180 seconds.

    Part II: train segment classifier
    now do training
    num training samples: 415
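For what it's worth, the "num threads: 1" line above comes from MITIE's own NER trainer, which rasa_nlu's ner_mitie component drives roughly like the sketch below (the sentence, entity spans, and file paths are made-up placeholders). Training a handful of samples directly against MITIE like this is one way to check whether the stall is in MITIE itself rather than in rasa_nlu:

    from mitie import ner_trainer, ner_training_instance

    # Made-up example sentence; the token index ranges below refer to this list.
    tokens = ["The", "TV", "charger", "in", "room", "12", "is", "broken"]
    sample = ner_training_instance(tokens)
    sample.add_entity(range(2, 3), "charger")       # token 2 -> "charger"
    sample.add_entity(range(5, 6), "room_number")   # token 5 -> "12"

    # Same total_word_feature_extractor.dat that the rasa_nlu mitie pipeline uses.
    trainer = ner_trainer("total_word_feature_extractor.dat")
    trainer.add(sample)
    trainer.num_threads = 4     # the log above shows this defaulting to 1
    ner = trainer.train()       # runs Part I (segmenter) and Part II (segment classifier)
    ner.save_to_disk("test_ner_model.dat")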

I understand this issue can be caused by redundant labels (as explained here); however, all of my labels are unique. My understanding is that training shouldn't take this long or use this much memory. I've seen others posting about similar issues, but no solution has been provided yet. What is causing the high memory usage and the insanely long training time, and how can it be fixed?
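One way to double-check this is a quick pass over the rasa_nlu training JSON, as sketched below (the filename is a placeholder for my actual training file): it counts every distinct entity label and flags overlapping annotations within an example.

    import json
    from collections import Counter

    # "training_data.json" is a placeholder for the actual rasa_nlu training file.
    with open("training_data.json") as f:
        examples = json.load(f)["rasa_nlu_data"]["common_examples"]

    label_counts = Counter()
    for ex in examples:
        entities = sorted(ex.get("entities", []), key=lambda e: e["start"])
        for ent in entities:
            label_counts[ent["entity"]] += 1
        # Flag annotations whose character spans overlap within the same example.
        for a, b in zip(entities, entities[1:]):
            if b["start"] < a["end"]:
                print("overlapping entities in %r: %s / %s" % (ex["text"], a["entity"], b["entity"]))

    print("distinct entity labels:", len(label_counts))
    for label, count in label_counts.most_common():
        print("%5d  %s" % (count, label))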




