Snippets

Jump to language: Java, JavaScript

Here is a collection of code snippets that I often find useful, all kept together in one handy place.


Java

Extract n-grams from text using a Lucene analyzer (Lucene 4.x API)

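// Imports assume the Lucene 4.x core and analyzers-common modules.
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.core.LowerCaseFilter;
import org.apache.lucene.analysis.core.StopFilter;
import org.apache.lucene.analysis.en.EnglishAnalyzer;
import org.apache.lucene.analysis.en.EnglishPossessiveFilter;
import org.apache.lucene.analysis.en.PorterStemFilter;
import org.apache.lucene.analysis.shingle.ShingleFilter;
import org.apache.lucene.analysis.standard.StandardFilter;
import org.apache.lucene.analysis.standard.StandardTokenizer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.util.Version;

private final static Analyzer analyzer = new Analyzer() {
  @Override
  protected TokenStreamComponents createComponents(final String fieldName, final Reader reader) {
    Version matchVersion = Version.LUCENE_CURRENT;
    final Tokenizer source = new StandardTokenizer(matchVersion, reader);
    TokenStream result = new StandardFilter(matchVersion, source);
    // the steps above are the start of StandardAnalyzer's chain
    result = new EnglishPossessiveFilter(matchVersion, result);
    result = new LowerCaseFilter(matchVersion, result);
    result = new StopFilter(matchVersion, result,
      EnglishAnalyzer.getDefaultStopSet());
    result = new PorterStemFilter(result);
    // up to here the chain matches EnglishAnalyzer
    // 2- and 3-word shingles; single tokens are also emitted by default
    result = new ShingleFilter(result, 2, 3);
    return new TokenStreamComponents(source, result);
  }
};

private static List<String> getNGrams(String text) throws IOException {
  List<String> ngrams = new ArrayList<String>();
  // the field name is not used by this analyzer, so null is passed
  final TokenStream stream = analyzer.tokenStream(null, new StringReader(text));
  try {
    // fetch the term attribute once instead of on every token
    final CharTermAttribute term = stream.addAttribute(CharTermAttribute.class);
    stream.reset();
    while (stream.incrementToken()) {
      ngrams.add(term.toString());
    }
    stream.end(); // finish the stream before closing it
  } finally {
    stream.close();
  }
  return ngrams;
}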
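
For intuition about what the shingle step contributes, here is a rough, dependency-free sketch of word n-gram generation over tokens that have already been split. The class ShingleSketch and the method shingles are hypothetical names of mine, not part of Lucene, and the sketch is only an approximation: the real ShingleFilter also emits single tokens by default and inserts filler tokens where stop words were removed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ShingleSketch {

  // Hypothetical helper (not a Lucene API): joins each run of
  // minSize..maxSize consecutive tokens into one space-separated string.
  public static List<String> shingles(List<String> tokens, int minSize, int maxSize) {
    List<String> out = new ArrayList<>();
    for (int start = 0; start < tokens.size(); start++) {
      // Grow the window from minSize up to maxSize, stopping at the end of the input.
      for (int size = minSize; size <= maxSize && start + size <= tokens.size(); size++) {
        out.add(String.join(" ", tokens.subList(start, start + size)));
      }
    }
    return out;
  }

  public static void main(String[] args) {
    List<String> tokens = Arrays.asList("quick", "brown", "fox", "jump");
    // prints [quick brown, quick brown fox, brown fox, brown fox jump, fox jump]
    System.out.println(shingles(tokens, 2, 3));
  }
}
```

The sizes 2 and 3 match the arguments passed to ShingleFilter in the snippet above.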

Sorting a map based upon the values (descending) – Java 8

public static <K, V extends Comparable<? super V>> Map<K, V> sortByValue(Map<K, V> map) {
  return map.entrySet()
    .stream()
    .sorted(Map.Entry.comparingByValue(Collections.reverseOrder()))
    .collect(Collectors.toMap(
      Map.Entry::getKey,
      Map.Entry::getValue,
      (e1, e2) -> e1,
      LinkedHashMap::new
    ));
}
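
A small usage sketch, wrapping the method above in a throwaway class so it can be run on its own. The class name SortByValueDemo and the sample word counts are made up for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

public class SortByValueDemo {

  // Same method as in the snippet above, repeated so this example is self-contained.
  public static <K, V extends Comparable<? super V>> Map<K, V> sortByValue(Map<K, V> map) {
    return map.entrySet()
      .stream()
      .sorted(Map.Entry.comparingByValue(Collections.reverseOrder()))
      .collect(Collectors.toMap(
        Map.Entry::getKey,
        Map.Entry::getValue,
        (e1, e2) -> e1,      // keys are already unique, so this merge function is never needed
        LinkedHashMap::new   // preserves the sorted insertion order
      ));
  }

  public static void main(String[] args) {
    Map<String, Integer> counts = new HashMap<>();
    counts.put("apple", 3);
    counts.put("banana", 7);
    counts.put("cherry", 5);
    // prints {banana=7, cherry=5, apple=3}
    System.out.println(sortByValue(counts));
  }
}
```

The LinkedHashMap supplier is what makes the result keep its order; collecting into a plain HashMap would discard the sorting.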

JavaScript