ENTRIES TAGGED "text processing"
Offline Design, Full Text, Parsing Library, and Node Streams
- Network Connectivity Optional (Luke Wroblewski) — we need progressive enhancement: assume people are offline, then enhance if they are actually online.
- Whoosh — fast, featureful full-text indexing and searching library implemented in pure Python.
- Flanker (GitHub) — open source address and MIME parsing library in Python. (via Mailgun Blog)
- Stream Adventure (GitHub) — interactive exercises to help you understand node streams.
Privacy: Gone in 150ms, Pen-Testing Tablet, Low-Level in Lua, and Metaphor Identification Shootout
- Behind the Banner — visualization of what happens in the 150ms when the cabal of data vultures decide which ad to show you. They pass around your data as enthusiastically as a pipe at a Grateful Dead concert, and you’ve just as much chance of getting it back. (via John Battelle)
- pwnpad — Nexus 7 with Android and Ubuntu, high-gain USB Bluetooth, Ethernet adapter, and a gorgeous suite of security tools. (via Kyle Young)
- Terra — a simple, statically-typed, compiled language with manual memory management [...] designed from the beginning to interoperate with Lua. Terra functions are first-class Lua values created using the terra keyword. When needed they are JIT-compiled to machine code. (via Hacker News)
- Metaphor Identification in Large Texts Corpora (PLOS ONE) — The paper presents the most comprehensive study of metaphor identification in terms of scope of metaphorical phrases and annotated corpora size. Algorithms’ performance in identifying linguistic phrases as metaphorical or literal has been compared to human judgment. Overall, the algorithms outperform the state-of-the-art algorithm with 71% precision and 27% averaged improvement in prediction over the base-rate of metaphors in the corpus.