snabelen.no is one of many independent Mastodon servers you can use to take part in the decentralized social web.
A Norwegian home for the decentralized microblogging platform.

#regex

Adding lookbehinds to rust-lang/regex, systemf.epfl.ch/blog/rust-rege.

Lookbehinds are very often absent from linear regex engines. These researchers bring them to the `regex` crate. The benchmarks show reasonable, usable performance, which makes it ready for real-world applications.

The article links to the research article and to the patches for `regex` (on github.com).

The prevention of unnecessary lookbehind scanning till the end of the haystack is neat!
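
For readers who have not met the feature, here is a minimal sketch of what a lookbehind does. It uses Python's built-in `re` module, which is a backtracking engine and not the linear-time Rust `regex` crate the article is about, so it only illustrates the matching semantics, not the implementation discussed there. The `dollar_amounts` helper and the sample sentence are made up for illustration.

```python
import re

# (?<=\$) is a lookbehind: it requires a "$" immediately before the digits,
# but the "$" itself is not part of the match.
PRICE = re.compile(r"(?<=\$)\d+")


def dollar_amounts(text: str) -> list[str]:
    """Return digit runs that are directly preceded by a dollar sign."""
    return PRICE.findall(text)


print(dollar_amounts("pay $42 for 17 items, or $5"))
```

Only the numbers with a `$` in front are returned; the bare `17` is skipped because the lookbehind fails there.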

Apropos of last weekend's #emacs hacking: how many times would you ask an #llm to generate a #regex for you before you give up and just use an #rx form?

Loving the power of occur in #Emacs, specifically multi-occur-in-matching-buffers in my case, for finding and listing in a single buffer all #regex matches from regex-filtered buffers. Searching across .csv terminology files this way gives me a buffer of search results, each line of which takes me to the line of the source file where the occurrence appears. Extremely useful in #translation work for searching across terminology dictionaries I've created in the past. Discovered via Mickey Petersen's Mastering Emacs. masteringemacs.org/article/sea

@jwildeboer

My guess is #ReGex:

In regular expressions, the underscore counts as a "word character", whilst dashes, commas, dots, whitespace etc. count as "non-word characters".

This is a hugely important thing: All relevant APIs rely heavily on RegExes.

I'm afraid your initiative won't succeed for that reason:

All #ActivityPub software would have to be revised and patched to solve a niche annoyance.

My bet: It ain't gonna happen.

@Gargron @evan
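
The word-character point in the post above can be checked with a short sketch. It uses Python's `re` module as one example of a common regex flavour; the sample strings and helper names are hypothetical, and other engines may differ in Unicode details.

```python
import re


def word_runs(text: str) -> list[str]:
    """Split text into runs of word characters (letters, digits, underscore)."""
    return re.findall(r"\w+", text)


def has_whole_word(word: str, text: str) -> bool:
    """True if `word` occurs in `text` delimited by word boundaries (\\b)."""
    return re.search(rf"\b{re.escape(word)}\b", text) is not None


# The underscore keeps "some_user" together; the dash and dot split the rest.
print(word_runs("some_user-name.example"))

# "foo" is a whole word in "foo-bar", but not in "foo_bar",
# because "_" is a word character and so there is no boundary after "foo".
print(has_whole_word("foo", "foo-bar"), has_whole_word("foo", "foo_bar"))
```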

@timbray my first job out of uni was parsing hand-coded HTML with Perl.
It has proven, thus far, to be both impossible to do perfectly and a fantastic source of job security.

(Yes I use Python NLP pipelines and such these days but still...)

@toddalstrom oh, I'm just discovering that it's possible to add "Regex" expressions to Mastodon filters!?

The expression for Thread will probably be less useful to me now that it is largely defederated, but it opens up a whole field of possibilities.

EDIT: Darn, now I'm disappointed to learn that this was only possible until 2018 in the official implementation ...

@PragmaticAndy I've felt the same myself: pedantry in writing comms helped my coding skills a lot.

However, while Python is useful for experimenting, it imposes a ceiling.

Would this be the case for using #regex for instance?

Or multi dimensional arrays in Gawk?

In my recent obsession with the #OscarMamen travel logs of his journey(s) to and time in #Mongolia #BogdKhanate, I wrangled data from a photo archive database.
For unknown reasons, the database doesn't have a field for "date". All date info is stored alongside content descriptions of the photos and their location in the physical archive in the free-text field "motive description".
To work through 7,500 photos and match them to log entries, I re-taught myself #regex & @OpenRefine.
Can recommend!
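
As a rough illustration of pulling dates out of a free-text field like the one described above, here is a small Python sketch. The date format (`DD.MM.YYYY`), the sample description and the `extract_dates` helper are all assumptions made for the example; the real archive's notation is not given in the post and may well differ.

```python
import re

# Assumed notation: day.month.year, e.g. "14.08.1913" (hypothetical).
DATE = re.compile(r"\b(\d{1,2})\.(\d{1,2})\.(\d{4})\b")


def extract_dates(description: str) -> list[str]:
    """Return every DD.MM.YYYY date in the text as an ISO-style YYYY-MM-DD string.

    No calendar validation is done; "99.99.1913" would pass through unchanged.
    """
    return [
        f"{year}-{int(month):02d}-{int(day):02d}"
        for day, month, year in DATE.findall(description)
    ]


# Made-up description text, not taken from the actual database.
print(extract_dates("Caravan near the river, 14.08.1913, shelf 3"))
```

The extracted strings sort chronologically, which helps when lining photos up against dated log entries.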

I still can't believe that most programming systems we use today are preoccupied with numbers. AFAIK, half of the (R5RS?) #Scheme standard is numbers and operations on them. Same for #C, #CommonLisp, #Java: ten different types of numbers and huge libraries for them.

Humans think in images and words. Structured text-oriented languages feel like a much better fit for everyone not corrupted by C. Yet we have few to no popular attempts in that space. Structured Regular Expressions didn't catch on; #ed1 and #awk are considered mere #regex automation tools. Modal and the term rewriting systems have their Merveilles Town, but not much beyond. sh/#bash and the like are quite successful, but aren't considered real programming languages either.

Why.

Do I have #regex experts among my followers? `echo "123.4506000" | sed -E 's/(\.[0-9]*[1-9])?0+$//; s/\.$//'` is intended to remove trailing 0s when it's a number with a decimal point. But when there are no digits behind the decimal point other than 0s, the whole number shall be stripped of the point and the 0s. What am I doing wrong? Sharing appreciated.
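
One likely culprit in the command above: the replacement is empty, so the captured group `(\.[0-9]*[1-9])` is deleted together with the trailing zeros, and `123.4506000` collapses to `123`. Because the group is optional, the pattern would also eat the zeros of a plain integer such as `100`. A possible fix is to make the group mandatory and put it back with a back-reference (in sed terms, `\1` in the replacement), then handle the "only zeros after the point" case separately. Below is a sketch of that idea in Python's `re` module rather than sed; the function name is made up, and this is one way to do it, not the only one.

```python
import re


def strip_trailing_zeros(number: str) -> str:
    """Remove trailing zeros after a decimal point; drop the point if nothing is left."""
    # Keep the fraction up to its last non-zero digit (\1), drop the zeros after it.
    number = re.sub(r"(\.[0-9]*[1-9])0+$", r"\1", number)
    # If only zeros (or nothing) follow the point, remove the point and the zeros.
    return re.sub(r"\.0*$", "", number)


for sample in ["123.4506000", "123.000", "123.", "1200", "0.50"]:
    print(sample, "->", strip_trailing_zeros(sample))
```

Integers without a decimal point, such as `1200`, are left untouched because both patterns require a literal dot.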