seq-tickets
seq-tickets copied to clipboard
Function(s) for extracting parts from URLs
Issue
It is difficult to work with URLs. It would help if there was support for using capturing groups in regular expressions (#1383), but even then it would still be very complicated.
Idea
Provide functions to work with URLs. Such as:
UrlPart('https://www.example.com:1234/hello/world?alpha=1&beta=2#test', part)
- When part is 'scheme' => 'https'
- When part is 'authority' => 'www.example.com:1234'
- When part is 'host' => 'www.example.com'
- When part is 'port' => 1234
- When part is 'path' => '/hello/world'
- When part is 'query' => 'alpha=1&beta=2'
- When part is 'fragment' => 'test'
See: https://en.wikipedia.org/wiki/URL for this terminology.
There are many edge cases in parsing URLs. Suggest cross-testing against a commonly used library.
Would love to see something like this implemented, but more general so any regular expression with capturing groups could be used.
Logs are stored in Postgres database with our current homegrown logging system. Regular expressions with capture groups can be used when selecting information to display. (documentation)
I would like to use something like this in Seq to show more concise graphs.
Loggers are formatted like namespaces (i.e., "a.b.c.d.f") and I only want "d.f" on the charts so they are readable.
select
-- substring(Logger from '(\w+\.\w+)$') as DisplayLogger, -- Postgres syntax
count(*) as EventCount
from stream
where @Timestamp > Now() - 2h
group by Logger
order by EventCount desc
limit 25
@schallm for your specific example I think you might be able to use indexOf
and substring
Can’t test right now, but interesting you can use a regex in the indexof
function. I’ll look into that in the morning. I tried testing with substring
in my query earlier and I get an error that substring
can not be used as an aggregate function in a group by query.
That's right, and there the semantics are the same as postgres. If you group the data then the select clause has to be an aggregate function that can reduce a set of rows (the group) to a single row. This query makes no sense:
You can move the Substring
operation to the group by
clause and it will appear in the output. Then choose your aggregate function.
All of the string matching functions support regular expressions, and the ci
modifier.
Thanks so much for your help! This works great! However, I have a few questions/thoughts. 😄
semantics are the same as postgres
While you can group by a substring in Postgres, you have to select it for the columns to be in the results then group by the alias.
Postgres:
select
substring(Logger from '(\w+\.\w+)$') as DisplayLogger,
count(*) as EventCount
from Events
group by DisplayLogger
order by EventCount desc
Working Seq query:
select count(*) as EventCount
from stream
where @Timestamp > Now() - 2h
group by substring(Logger, indexof(Logger, /(\w+\.)?\w+$/)) as DisplayLogger
order by EventCount desc
For others (and future me), note that you can rename the column in the group by
statement as shown above.
This works great as substring doesn't need to know the length of the match as I want the remainder of the string.
It would still be nice to have substring
work directly with regular expressions and a capture group so I could get text from the middle of a string. For example, I want to get just the milliseconds out of the following string:
The process (processId: 23) took 6.41ms to complete.
Postgres:
substring(Message from 'took ([\d\.]+)ms') -- returns '6.41'
Could something like the following syntax be added? Or does functionality similar to this already exist?
substring(Message, /took ([\d\.]+)ms/) -- returns '6.41'
The answer could be "use structured logging". However, I think being able to extract data out of strings would always be useful.