DIYbanter


Joe gwinn August 17th 13 05:51 PM

Need regex code for counting newsgroups
 
There was a thread some time ago on ways to kill postings with too many
cross-posts. Some of these methods involved regular expressions, most
likely counting commas in the newsgroups header. Can anyone provide a
pointer to the thread, or related threads?

Thanks,

Joe Gwinn

Pete Keillor[_2_] August 17th 13 06:56 PM

On Sat, 17 Aug 2013 12:51:54 -0400, Joe Gwinn
wrote:

There was a thread some time ago on ways to kill postings with too many
cross-posts. Some of these methods involved regular expressions, most
likely counting commas in the newsgroups header. Can anyone provide a
pointer to the thread, or related threads?

Thanks,

Joe Gwinn


You can set that in Agent 6.0 in folder properties. Today, I've been
using the Message-ID field for the new flock of bozos. When I see a
message id containing "theremailer" or "dont-email.me", it's a pretty
good bet for the bin.

Pete Keillor

Joe gwinn August 17th 13 10:58 PM

Pete Keillor wrote:

On Sat, 17 Aug 2013 12:51:54 -0400, Joe Gwinn
wrote:

There was a thread some time ago on ways to kill postings with too many
cross-posts. Some of these methods involved regular expressions, most
likely counting commas in the newsgroups header. Can anyone provide a
pointer to the thread, or related threads?

Thanks,

Joe Gwinn


You can set that in Agent 6.0 in folder properties. Today, I've been
using the Message-ID field for the new flock of bozos. When I see a
message id containing "theremailer" or "dont-email.me", it's a pretty
good bet for the bin.


Except that I cannot use Agent, because there is no Mac version.

I am using Thoth under MacOS, and Thoth follows Perl's version of regex.

Joe Gwinn

William Bagwell August 18th 13 01:13 AM

On Sat, 17 Aug 2013 17:58:13 -0400, Joe Gwinn wrote:

Except that I cannot use Agent, because there is no Mac version.


Runs great under Crossover Office on Linux; others have reported that the Mac
version is just as good. Turning on a signature I normally use only in the
Forte Agent beta group...
--
William
Mandriva 2008.1 Linux, KDE, Cross Over Office 11.3.1, Athlon 64 X2 Dual-Core
4600+ (2.4GHz), 2GB RAM, ext3 file system, Gigabyte GA-M61PME-S2P, GeForce 6100
(built in), NVIDIA-x86-100.14.11 accelerated driver, 1280x1024x60Hz

Larry Jaques[_4_] August 18th 13 03:03 AM

On Sat, 17 Aug 2013 12:56:55 -0500, Pete Keillor
wrote:

On Sat, 17 Aug 2013 12:51:54 -0400, Joe Gwinn
wrote:

There was a thread some time ago on ways to kill postings with too many
cross-posts. Some of these methods involved regular expressions, most
likely counting commas in the newsgroups header. Can anyone provide a
pointer to the thread, or related threads?

Thanks,

Joe Gwinn


You can set that in Agent 6.0 in folder properties. Today, I've been
using the Message-ID field for the new flock of bozos. When I see a
message id containing "theremailer" or "dont-email.me", it's a pretty
good bet for the bin.


How are you doing that, Pete? I only see a Message-ID field in
filtering emails, not Usenet posts.

--
Truth loves to go naked.
--Dr. Thomas Fuller, Gnomologia, 1732

DoN. Nichols[_2_] August 18th 13 05:01 AM

On 2013-08-17, Joe Gwinn wrote:
There was a thread some time ago on ways to kill postings with too many
cross-posts. Some of these methods involved regular expressions, most
likely counting commas in the newsgroups header. Can anyone provide a
pointer to the thread, or related threads?


The way I do it is based on the commas in the "Newsgroups: "
header.

Two or more newsgroups      .*,.*
Three or more newsgroups    .*,.*,.*
Four or more newsgroups     .*,.*,.*,.*

Where ".*" means any number of any characters.
',' means itself -- a plain old comma.

Exactly how you tell your newsreader to use those varies. I
would do something like "-10" for each one of those, so two newsgroups
would be -10, three would be -20, four (or more) would be -30, and set
the auto-kill threshold to -15. (Three is too many in cross-posting.) I
also use + scores on the "Subject: " header for the very few things
which I want to see which are cross-posted, such as the "What Is It"
weekly puzzle posting thread.
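
The scoring scheme above can be tried outside any newsreader. What follows is only a sketch in Python, whose `re` module handles these simple patterns the same way a Perl-style engine would. The function names and the rule that a post is killed when its score is at or below the threshold are assumptions made for illustration; a real newsreader may compare scores differently.

```python
import re

# The three comma-counting patterns from the post, each worth -10.
PATTERNS = [r".*,.*", r".*,.*,.*", r".*,.*,.*,.*"]
PENALTY = -10
KILL_THRESHOLD = -15  # assumption: kill when the score is at or below this


def crosspost_score(newsgroups):
    """Add one penalty for every pattern that matches the header value."""
    return sum(PENALTY for pattern in PATTERNS if re.search(pattern, newsgroups))


def is_killed(newsgroups):
    """True when the accumulated score reaches the kill threshold."""
    return crosspost_score(newsgroups) <= KILL_THRESHOLD


if __name__ == "__main__":
    for header in ["a.b", "a.b,c.d", "a.b,c.d,e.f", "a.b,c.d,e.f,g.h"]:
        print(header, crosspost_score(header), is_killed(header))
```

Because every pattern means "at least this many commas", a three-group post matches the first two patterns and scores -20, which is why a threshold of -15 lets two-group posts through and kills the rest.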

Enjoy,
DoN.

--
Remove oil spill source from e-mail
Email: | (KV4PH) Voice (all times): (703) 938-4564
(too) near Washington D.C. | http://www.d-and-d.com/dnichols/DoN.html
--- Black Holes are where God is dividing by zero ---

James Waldby[_3_] August 18th 13 08:11 AM

On Sat, 17 Aug 2013 12:51:54 -0400, Joe Gwinn wrote:

There was a thread some time ago on ways to kill postings with too many
cross-posts. Some of these methods involved regular expressions, most
likely counting commas in the newsgroups header. Can anyone provide a
pointer to the thread, or related threads?


The third link in google for terms
newsgroup filter crosspost count
looks like an r.c.m thread from 31 Aug 2011,
subject = "Agent kill filter help please". See e.g.
https://groups.google.com/forum/#!topic/rec.crafts.metalworking/P_jb3KpxUaU

You could also add a term for your specific newsreader.

--
jiw

Pete Keillor[_2_] August 18th 13 12:37 PM

On Sat, 17 Aug 2013 19:03:28 -0700, Larry Jaques
wrote:

On Sat, 17 Aug 2013 12:56:55 -0500, Pete Keillor
wrote:

On Sat, 17 Aug 2013 12:51:54 -0400, Joe Gwinn
wrote:

There was a thread some time ago on ways to kill postings with too many
cross-posts. Some of these methods involved regular expressions, most
likely counting commas in the newsgroups header. Can anyone provide a
pointer to the thread, or related threads?

Thanks,

Joe Gwinn


You can set that in Agent 6.0 in folder properties. Today, I've been
using the Message-ID field for the new flock of bozos. When I see a
message id containing "theremailer" or "dont-email.me", it's a pretty
good bet for the bin.


How are you doing that, Pete? I only see a Message-ID field in
filtering emails, not Usenet posts.


H key to show the headers, then make a new filter with the message id
in the filter expression like below.

Message-ID: {googlegroups\.com}

The "\" is needed in front of the "." in the regular expression
language so that it isn't interpreted as a wildcard.
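
The effect of the backslash is easy to check with a small sketch. This uses Python's `re` module rather than Agent itself, and the two Message-ID strings are made up for illustration. The braces around the pattern are part of Agent's filter syntax and are not reproduced here.

```python
import re

escaped = re.compile(r"googlegroups\.com")    # "\." matches only a literal dot
unescaped = re.compile(r"googlegroups.com")   # "." matches any single character

# Hypothetical Message-ID values, invented for this example.
real = "<abc123@googlegroups.com>"
lookalike = "<abc123@googlegroupsXcom>"

if __name__ == "__main__":
    print(bool(escaped.search(real)))         # literal dot is present
    print(bool(escaped.search(lookalike)))    # "X" is not a dot
    print(bool(unescaped.search(lookalike)))  # wildcard accepts the "X"
```

In practice the unescaped form rarely causes a false hit on a domain name, but escaping the dot keeps the filter matching exactly what was intended.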

Pete Keillor

Larry Jaques[_4_] August 18th 13 03:25 PM

On Sun, 18 Aug 2013 06:37:29 -0500, Pete Keillor
wrote:

On Sat, 17 Aug 2013 19:03:28 -0700, Larry Jaques
wrote:

On Sat, 17 Aug 2013 12:56:55 -0500, Pete Keillor
wrote:

On Sat, 17 Aug 2013 12:51:54 -0400, Joe Gwinn
wrote:

There was a thread some time ago on ways to kill postings with too many
cross-posts. Some of these methods involved regular expressions, most
likely counting commas in the newsgroups header. Can anyone provide a
pointer to the thread, or related threads?

Thanks,

Joe Gwinn

You can set that in Agent 6.0 in folder properties. Today, I've been
using the Message-ID field for the new flock of bozos. When I see a
message id containing "theremailer" or "dont-email.me", it's a pretty
good bet for the bin.


How are you doing that, Pete? I only see a Message-ID field in
filtering emails, not Usenet posts.


H key to show the headers, then make a new filter with the message id
in the filter expression like below.

Message-ID: {googlegroups\.com}

The "\" is needed in front of the "." in the regular expression
language so that it isn't interpreted as a wildcard.


Aha! Peachy. I'll give that a try. Thanks. Maybe this will work
for References headers, too. If I can catch the original message (he's
filtered in my reader) when he's being referred to, I can end the spam
created when good folks (guilt-tripping Gunner, et al.) reply to said
spammers.

I've been asking Agent techs (Beck, Gold, Prince) for text (or more
header) filtering for a decade now, to no avail.

--
Truth loves to go naked.
--Dr. Thomas Fuller, Gnomologia, 1732

Joe gwinn August 18th 13 04:13 PM

DoN. Nichols wrote:

On 2013-08-17, Joe Gwinn wrote:
There was a thread some time ago on ways to kill postings with too many
cross-posts. Some of these methods involved regular expressions, most
likely counting commas in the newsgroups header. Can anyone provide a
pointer to the thread, or related threads?


The way I do it is based on the commas in the "Newsgroups: "
header.

Two or more newsgroups      .*,.*
Three or more newsgroups    .*,.*,.*
Four or more newsgroups     .*,.*,.*,.*

Where ".*" means any number of any characters.
',' means itself -- a plain old comma.


That's what I was trying to remember. I knew there was a simple,
battle-tested solution. Thanks


Exactly how you tell your newsreader to use those varies.


I use Thoth, which uses PCRE, the Perl-compatible regular expression
library (http://www.pcre.org/).


I would do something like "-10" for each one of those, so two newsgroups
would be -10, three would be -20, four (or more) would be -30, and set
the auto-kill threshold to -15. (Three is too many in cross-posting.)


Wonder if there is a direct way to count commas and kill if count
exceeds some threshold.
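
One possible direct approach is a single pattern with a bounded repeat: "at least N commas" is the same as "more than N newsgroups". The sketch below uses Python's `re` module; the `{n,}` quantifier and the `(?:...)` group are standard Perl-style syntax, but whether Thoth accepts them in its filter dialog is an assumption that would need testing. The function names are invented for illustration.

```python
import re


def too_many_groups_pattern(max_groups):
    """Build a regex matching a Newsgroups value with more than max_groups groups.

    More than max_groups groups means at least max_groups commas.
    """
    return re.compile(r"^(?:[^,]*,){%d,}" % max_groups)


def newsgroup_count(newsgroups):
    """Count groups without a regex: one more than the number of commas."""
    return newsgroups.count(",") + 1


KILL = too_many_groups_pattern(2)  # matches three or more groups

if __name__ == "__main__":
    for header in ["a.b", "a.b,c.d", "a.b,c.d,e.f"]:
        print(header, newsgroup_count(header), bool(KILL.search(header)))
```

With a limit of two, the generated pattern is `^(?:[^,]*,){2,}`, which could be pasted into a filter as one rule in place of several scored ones.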


I also use + scores on the "Subject: " header for the very few things
which I want to see which are cross-posted, such as the "What Is It"
weekly puzzle posting thread.


I can also put the subject header test earlier in the processing, and
terminate filter processing for the correct subject.
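
As a rough sketch of that ordering, here is the same idea in Python: the subject test runs first and stops processing, so the cross-post rule never sees a wanted thread. The rule names, the sample subjects, and the limit of two newsgroups are assumptions for illustration, not Thoth settings.

```python
import re

# Rule 1: subjects to keep no matter how widely they are cross-posted.
WANTED_SUBJECT = re.compile(r"What Is It", re.IGNORECASE)
# Rule 2: at least two commas in the header, i.e. three or more newsgroups.
CROSSPOSTED = re.compile(r"^(?:[^,]*,){2,}")


def decide(subject, newsgroups):
    """Apply the rules in order; the first rule that matches ends processing."""
    if WANTED_SUBJECT.search(subject):
        return "keep"
    if CROSSPOSTED.search(newsgroups):
        return "kill"
    return "keep"


if __name__ == "__main__":
    print(decide("What is it? Set 507", "a.b,c.d,e.f"))
    print(decide("Cheap pills", "a.b,c.d,e.f"))
    print(decide("Lathe question", "rec.crafts.metalworking"))
```

Putting the whitelist first avoids having to balance positive and negative scores against each other.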


Joe Gwinn

Joe gwinn August 18th 13 04:31 PM

James Waldby wrote:

On Sat, 17 Aug 2013 12:51:54 -0400, Joe Gwinn wrote:

There was a thread some time ago on ways to kill postings with too many
cross-posts. Some of these methods involved regular expressions, most
likely counting commas in the newsgroups header. Can anyone provide a
pointer to the thread, or related threads?


The third link in google for terms
newsgroup filter crosspost count
looks like an r.c.m thread from 31 Aug 2011,
subject = "Agent kill filter help please". See e.g.
https://groups.google.com/forum/#!topic/rec.crafts.metalworking/P_jb3KpxUaU


I did do that and got flooded with suggestions that I use some other
newsreader, but I was looking for actual regex code, because the reader
I currently use (Thoth) does accept regex, so step one is to try that.


You could also add a term for your specific newsreader.


Hmm. Thoth is uncommon, so I didn't think to try that.

But I just tried it, using "thoth regex cross posting" (without
quotes), and it yielded another regular expression approach, "x-ref
matches Regular Expression", from


http://compgroups.net/comp.sys.mac.s...sted-messages-in-thoth/1154581 and

http://macusenet.com/archive/index-t-9149.html

Joe Gwinn

