CommunityNews

CommunityNews

Membership Testing for Semantic Regular Expressions

Membership Testing for Semantic Regular Expressions.
SMORE (Chen et al., 2023) recently proposed the concept of semantic regular expressions that extend the classical formalism with a primitive to query external oracles such as databases and large language models (LLMs). Such patterns can be used to identify lines of text containing references to semantic concepts such as cities, celebrities, political entities, etc. The focus in their paper was on automatically synthesizing semantic regular expressions from positive and negative examples. In this paper, we study the membership testing problem:
First, We present a two-pass NFA-based algorithm to determine whether a string $w$ matches a semantic regular expression (SemRE) $r$ in $O(|r|^2 |w|^2 + |r| |w|^3)$ time, assuming the oracle responds to each query in unit time. In common situations, where oracle queries are not nested, we show that this procedure runs in $O(|r|^2 |w|^2)$ time. Experiments with a prototype implementation of this algorithm validate our theoretical analysis, and show that the procedure massively outperforms a dynamic programming-based baseline, and incurs a $\approx 2 \times$ overhead over the time needed for interaction with the oracle.
Next, We establish connections between SemRE membership testing and the triangle finding problem from graph theory, which suggest that developing algorithms which are simultaneously practical and asymptotically faster might be challenging. Furthermore, algorithms for classical regular expressions primarily aim to optimize their time and memory consumption. In contrast, an important consideration in our setting is to minimize the cost of invoking the oracle. We demonstrate an $Ω(|w|^2)$ lower bound on the number of oracle queries necessary to make this determination.

Read in full here:

This thread was posted by one of our members via one of our news source trackers.

Where Next?

Popular General Dev topics Top

First poster: bot
Hush Keyboards with Hushboard. Yesterday while surfing the ASCII highways of IRC (yes, IRC) a URL linking to a MacOS application scrolle...
New
First poster: malloryerik
GitHub - hlissner/doom-emacs: An Emacs framework for the stubborn martian hacker. An Emacs framework for the stubborn martian hacker - G...
New
CommunityNews
GitHub - livekit/livekit: Scalable, high-performance WebRTC SFU. SDKs in JavaScript, React, React Native, Flutter, Swift, Kotlin, Unity/C...
New
CommunityNews
ABSTRACT In lieu of a traditional , I’ve tried to distill the essence of the talk into a collection of maxims: All programmers are API ...
New
First poster: bot
API Gateway Trends behind Features: Apache APISIX 3.0 vs. Kong 3.0 - API7.ai. By comparing the open-source API Gateway Apache APISIX and...
New
First poster: bot
Raspberry Pi security alarm — the basics. In November last year — I started building a DIY security alarm system, using a Raspberry Pi a...
New
First poster: bot
Rewrite it in Rust by ridiculousfish · Pull Request #9512 · fish-shell/fish-shell. (Sorry for the meme; also this is obligatory.) I thi...
New
First poster: bot
Hector Martin (@marcan@treehouse.systems). Attached: 1 image For those wondering why the hell we need all this safety system stuff for...
New
First poster: dyowee
GitHub - TodePond/DreamBerd: perfect programming language. perfect programming language. Contribute to TodePond/DreamBerd development by...
New
First poster: dyowee
olmOCR is an open-source tool for converting PDFs to text with high accuracy, preserving reading order and supporting tables, equations, ...
New

Other popular topics Top

Devtalk
Hello Devtalk World! Please let us know a little about who you are and where you’re from :nerd_face:
New
AstonJ
SpaceVim seems to be gaining in features and popularity and I just wondered how it compares with SpaceMacs in 2020 - anyone have any thou...
New
New
PragmaticBookshelf
Rust is an exciting new programming language combining the power of C with memory safety, fearless concurrency, and productivity boosters...
New
PragmaticBookshelf
Tailwind CSS is an exciting new CSS framework that allows you to design your site by composing simple utility classes to create complex e...
New
Exadra37
Oh just spent so much time on this to discover now that RancherOS is in end of life but Rancher is refusing to mark the Github repo as su...
New
AstonJ
Continuing the discussion from Thinking about learning Crystal, let’s discuss - I was wondering which languages don’t GC - maybe we can c...
New
New
New
AstonJ
If you’re getting errors like this: psql: error: connection to server on socket “/tmp/.s.PGSQL.5432” failed: No such file or directory ...
New