
RegEx Strikes Back: Regular Expressions for Text Mining

A short time ago, in a galaxy not so far away, a regular expression was taking 5 days to run. In this talk you will learn why regular expressions can be slow, how to make them fast using a trie regex data structure, and the many uses a good old regular expression can have.

Abstract

Regular expressions have a bad reputation: they are said to be slow for text mining tasks. In this talk you’ll learn why a regex can be slow and how to use a Trie Regex to craft blazingly fast regular expressions with no effort. You’ll also see how regular expressions integrate smoothly with many libraries (pandas, spaCy, etc.) and how to use the regex module for common text cleaning tasks such as prefix finding, fuzzy matching, and many more.
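To make the trie regex idea concrete, here is a minimal sketch using only the standard library `re` module. It is not the speaker's implementation (see the slides for that); the helper names `build_trie` and `trie_to_pattern` are hypothetical and chosen for illustration. The assumption is that the slowness comes from a flat alternation such as `foo|foobar|fox|bar`, which a backtracking engine may have to try word by word, whereas a trie-shaped pattern shares common prefixes so each character is examined fewer times.

```python
import re


def build_trie(words):
    """Store the words in a nested dict; the key "" marks the end of a word."""
    trie = {}
    for word in words:
        node = trie
        for ch in word:
            node = node.setdefault(ch, {})
        node[""] = {}  # end-of-word marker
    return trie


def trie_to_pattern(node):
    """Turn a trie node into a regex fragment that shares common prefixes."""
    # A node holding only the end marker contributes nothing further.
    if "" in node and len(node) == 1:
        return ""
    optional = "" in node  # a word ends here, so the continuation is optional
    branches = [
        re.escape(ch) + trie_to_pattern(node[ch])
        for ch in sorted(key for key in node if key)
    ]
    # A single mandatory branch needs no grouping.
    if len(branches) == 1 and not optional:
        return branches[0]
    body = "(?:" + "|".join(branches) + ")"
    return body + "?" if optional else body


words = ["foo", "foobar", "fox", "bar"]
pattern = trie_to_pattern(build_trie(words))
print(pattern)  # (?:bar|fo(?:o(?:bar)?|x))

# Word boundaries keep "food" from matching on its "foo" prefix.
matcher = re.compile(r"\b" + pattern + r"\b")
print(matcher.findall("the fox ate foobar at the bar, not food"))
# ['fox', 'foobar', 'bar']
```

With a handful of words the difference is negligible; the benefit is expected to show with vocabularies of thousands of terms, where the flat alternation grows linearly while the trie pattern reuses shared prefixes.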

Slides: https://speakerdeck.com/mesejo/pycon-italia-regex-strikes-back

Speaker
Daniel Mesejo
Topic
Python & Friends
Audience level
Intermediate
Language
English
Duration
30 minutes