Why is the cosine similarity from a pretrained fastText model high between two sentences that are not related at all?
I am wondering why the pre-trained fastText model for Korean Wikipedia does not seem to work well :(

```
model = fasttext.load_model("./fasttext/wiki.ko.bin")
model.cosine_similarity("테스트 테스트 이건 테스트 문장", "지금 아무 관계 없는 글 정말로 정말로")
```

(In English:)

```
model.cosine_similarity("test test this is a test sentence", "now a completely unrelated document really really")
```

This returns 0.99...? The two sentences are not related at all in meaning, so I would expect the cosine similarity to be much lower. However, it was 0.997383. Is it impossible to compare whole sentences with fastText? Is doc2vec the only way?
Which 'fasttext' code package are you using? Are you sure its cosine_similarity() is designed to take such raw strings and automatically tokenize/combine the words of each example to give sentence-level similarities? (Is that capability implied by its documentation or illustrative examples? Or does it perhaps expect pre-tokenized lists of words?)
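One common reason for scores like 0.997 is that the library averages the word vectors of each sentence: if most word vectors share a large common component (as frequent-word vectors in one trained model tend to), the averages of two short, unrelated sentences still point in almost the same direction. Here is a minimal self-contained sketch of that geometry with synthetic vectors (the shared-component setup is an illustrative assumption, not the actual wiki.ko.bin vectors):

```python
import numpy as np

def cosine(u, v):
    """Plain cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

rng = np.random.default_rng(0)

# Pretend each word vector = a big component shared across the vocabulary
# plus a small word-specific part (a toy stand-in for real embeddings).
common = rng.normal(size=100)
sentence_a = [common + 0.1 * rng.normal(size=100) for _ in range(5)]
sentence_b = [common + 0.1 * rng.normal(size=100) for _ in range(5)]

# Averaging the word vectors (how many toolkits build a sentence vector)
# washes out the small word-specific parts, so the two "sentence vectors"
# end up nearly parallel even though the "words" differ.
vec_a = np.mean(sentence_a, axis=0)
vec_b = np.mean(sentence_b, axis=0)

print(cosine(vec_a, vec_b))  # very close to 1.0 despite unrelated "words"
```

If this is what is happening, a similarity near 1.0 says little about meaning. For real sentence comparison you would at minimum tokenize properly, and possibly use a method designed for sentences (doc2vec, or averaged vectors with the common component removed).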
Custom word weights for sentences when calling h2o transform and word2vec, instead of straight AVERAGE of words
Where can I find pre-trained word embeddings (English) in word2vec format with 50 dimensions?
Word2Vec Output Vectors
Does word2vec support multiple languages?
Using ConceptNet5 API to calculate the similarity between texts
Phrase Similarity Score Calculation and Skill-set Extraction from Job Description
How to produce n-gram word-class language model by word2vec?
How can I get cosine distance between two words in Deeplearning4j - Word2vec
Some problems in word2vec: the value range of the array vocab_hash is larger than vocab's index range?
Paragraph Vector construction and training
CBOW (Continuous Bag of Word) Understandable Code
word2vec specify own training pair for cbow model
Replacing vec.bin with Google News model
word2vec implementation addresing male/female and singular/plural issues
how can I make use of the word2vec pretrained vectors?
word2vec gives vectors for very few words in a text. Why?