elasticsearch


Searching a list of names by categorizing each letter as a consonant or vowel


I want to index a large list of names using Elasticsearch.
I want to distinguish between consonants and vowels in each word, and be able to search by the position of each letter and whether it is a consonant or a vowel.
For example, given a name like:
JOHN
I want to enter this:
CVCC
and when I run the search, JOHN should be in the result set.
Is it possible to index names in Elasticsearch so that I can search them using the tokens C for consonant and V for vowel?
Elasticsearch would somehow have to index the character type at each position of each word. How can this be done?
You can do it with pattern_replace char filters in a custom analyzer. In my solution I have also put the custom analyzer on a sub-field, on the assumption that you will want other kinds of searches on the name field, with the consonant/vowel search being only one of them.
DELETE test

PUT test
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "keyword",
          "char_filter": [
            "replace_filter_lowercase_CONS",
            "replace_filter_uppercase_CONS",
            "replace_filter_lowercase_VOW",
            "replace_filter_uppercase_VOW"
          ]
        }
      },
      "char_filter": {
        "replace_filter_lowercase_CONS": {
          "type": "pattern_replace",
          "pattern": "[b-df-hj-np-tv-z]{1}",
          "replacement": "c"
        },
        "replace_filter_uppercase_CONS": {
          "type": "pattern_replace",
          "pattern": "[B-DF-HJ-NP-TV-Z]{1}",
          "replacement": "C"
        },
        "replace_filter_lowercase_VOW": {
          "type": "pattern_replace",
          "pattern": "[aeiou]{1}",
          "replacement": "v"
        },
        "replace_filter_uppercase_VOW": {
          "type": "pattern_replace",
          "pattern": "[AEIOU]{1}",
          "replacement": "V"
        }
      }
    }
  },
  "mappings": {
    "test": {
      "properties": {
        "name": {
          "type": "text",
          "fields": {
            "cons_vow": {
              "type": "text",
              "analyzer": "my_analyzer"
            }
          }
        }
      }
    }
  }
}

POST /test/test/1
{"name":"JOHN"}

POST /test/test/2
{"name":"Andrew"}

POST /test/test/3
{"name":"JOhn DOE"}

GET /test/_search
{
  "query": {
    "term": {
      "name.cons_vow": {
        "value": "CVCC"
      }
    }
  }
}
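
To see which token each sample name should end up with in name.cons_vow, here is a minimal Python sketch that mimics the four char filters in the order they are listed in my_analyzer. It is only an approximation for illustration: it assumes Python's re.sub behaves like the pattern_replace filter for these simple character classes, and the names CHAR_FILTERS and cons_vow_token are made up for this example, not part of Elasticsearch.

```python
import re

# Same patterns and the same order as the char_filter list in my_analyzer.
# The redundant {1} quantifier is dropped; it does not change what is matched.
CHAR_FILTERS = [
    (r"[b-df-hj-np-tv-z]", "c"),  # lowercase consonants
    (r"[B-DF-HJ-NP-TV-Z]", "C"),  # uppercase consonants
    (r"[aeiou]", "v"),            # lowercase vowels
    (r"[AEIOU]", "V"),            # uppercase vowels
]


def cons_vow_token(name: str) -> str:
    """Apply each replacement in turn and return the result.

    The keyword tokenizer would then emit this whole string as a single token.
    """
    for pattern, replacement in CHAR_FILTERS:
        name = re.sub(pattern, replacement, name)
    return name


if __name__ == "__main__":
    for sample in ["JOHN", "Andrew", "JOhn DOE"]:
        print(sample, "->", cons_vow_token(sample))
```

Run as a script, this prints JOHN -> CVCC, Andrew -> Vcccvc and JOhn DOE -> CVcc CVV. Since a term query is not analyzed and the patterns preserve case, searching for CVCC should match only document 1. The order of the filters also matters: the consonant filters have to run before the vowel filters, because the replacement letters v and V are themselves matched by the consonant patterns.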
