Subject: Re: [chairs] SPAM
Karl:

It sounds like DRRW is signing up to write the code for you :-) Problem solved.

-Matt

On Apr 13, 2004, at 12:27 PM, David RR Webber wrote:

> Karl,
>
> What I was thinking is dumping a simple XML file out with
> email address + sequence number - and reading that into
> memory as a part of the archive generator - and drive
> the search/replace that way.
>
> Simple thing to update that XML file when a member joins
> or changes email address by re-running that dump to
> the XML flatfile.
>
> DW.
>
> ----- Original Message -----
> From: "Karl F. Best" <karl.best@oasis-open.org>
> To: "David RR Webber" <david@drrw.info>
> Cc: "Eve L. Maler" <Eve.Maler@Sun.COM>; <chairs@lists.oasis-open.org>
> Sent: Tuesday, April 13, 2004 12:20 PM
> Subject: Re: [chairs] SPAM
>
>> David:
>>
>> I like the idea. It would require a bit of extra processing for each
>> message to match the sender with the Kavi database. The big problem,
>> though, is that we don't have membership numbers; the "key" or unique
>> identifier in the database is the email address.
>>
>> But this might have some merit anyway. We'll think about what might be
>> possible.
>>
>> -Karl
>>
>> David RR Webber wrote:
>>> Karl,
>>>
>>> Replacing the address with the OASIS # satisfies your
>>> requirement. It's basically impossible since there is
>>> no correlation.
>>>
>>> The only way back is if you have the OASIS membership
>>> / number xref list.
>>>
>>> My guess is you could set up a simple Java program or
>>> XSLT script to do this replacement stripping...
>>>
>>> DW.
>>>
>>> ----- Original Message -----
>>> From: "Karl F. Best" <karl.best@oasis-open.org>
>>> To: "Eve L. Maler" <Eve.Maler@Sun.COM>
>>> Cc: <chairs@lists.oasis-open.org>
>>> Sent: Tuesday, April 13, 2004 12:10 PM
>>> Subject: Re: [chairs] SPAM
>>>
>>>> Yeah, we thought about something like that, i.e. replacement of the
>>>> address with some sort of code. But in order to be effective it must be
>>>> costly (i.e.
>>>> impossible for a machine, requires a human) to re-convert large
>>>> quantities of addresses, but simple for a human to re-convert a
>>>> single address.
>>>>
>>>> From the first Slashdot example, at least, it would be simple for a
>>>> human to look at the address and create a simple rule for how to
>>>> recreate the original.
>>>>
>>>> -Karl
>>>>
>>>> p.s. <chuckle> the rotating banner at the top of the Slashdot page when
>>>> I viewed it was an O'Reilly ad for a book on creating spiders and
>>>> bots... </>
>>>>
>>>> Eve L. Maler wrote:
>>>>
>>>>> Why not just use a mechanistic, but variable, means of disguising the
>>>>> email address the way Slashdot does? An example appears here:
>>>>>
>>>>> http://slashdot.org/comments.pl?sid=103884&cid=8848779
>>>>>
>>>>> The email link shows up as:
>>>>>
>>>>> mailto:heironymouscoward%40yah%5B%20%5Dcom%20%5B'oo.'%20in%20gap%5D
>>>>>
>>>>> A human can decode this as necessary, but a machine has a much tougher
>>>>> time. Here's another:
>>>>>
>>>>> http://slashdot.org/comments.pl?sid=103883&cid=8848358
>>>>>
>>>>> The email link shows up as:
>>>>>
>>>>> mailto:dgorman%40nosPaM.arete.cc
>>>>>
>>>>> Etc. I believe the engine behind Slashdot is open-source, so maybe that
>>>>> (or part of it, anyway) can be used. Though I wonder about its
>>>>> effectiveness if a spammer can locate all the disguise techniques in a
>>>>> file somewhere...
>>>>>
>>>>> Eve
>>>>>
>>>>> Karl F. Best wrote:
>>>>>
>>>>>> Chairs:
>>>>>>
>>>>>> I'll open another can of worms and jump into this :-)
>>>>>>
>>>>>> I agree with you wholeheartedly, Duane, that this is a problem. I'll
>>>>>> bet that I get more spam than you do (few hundred a day). And I have
>>>>>> no doubt that all this is because of spammers harvesting addresses
>>>>>> from our list archives.
>>>>>>
>>>>>> Of course a knee-jerk reaction would be to close off the archives so
>>>>>> that nobody can get to them, but given that the OASIS philosophy is
>>>>>> openness and accountability we need to keep things open and accessible.
>>>>>>
>>>>>> There seem to be two possible solutions: either disguise the
>>>>>> addresses stored in the archives, or somehow block access so that
>>>>>> only a human can get through. (I don't think that we want to go down
>>>>>> the path of an offensive strategy such as what Duane suggests.)
>>>>>>
>>>>>> Lacking a foolproof Turing test to allow only human access to the
>>>>>> archives, I think the best and easiest solution will probably be to
>>>>>> disguise the email addresses attached to each message so that whatever
>>>>>> is harvested is unusable by spammers. The disguise would have to be
>>>>>> such that the harvester would not be able to accurately or easily
>>>>>> recreate the address. Obviously substituting the word "at" for the @
>>>>>> sign isn't going to fool anybody for very long. But whatever we do may
>>>>>> not disguise the actual identity of the sender; we need to know who
>>>>>> sent the message.
>>>>>>
>>>>>> A final question is whether it is necessary for a person to be able to
>>>>>> respond to a message he found in the archives; i.e. does the guy on
>>>>>> the street need to be able to figure out how to respond to Duane when
>>>>>> he reads something that Duane wrote? Perhaps this requirement is not
>>>>>> so important, as TC members already know how to respond to the TC
>>>>>> list, and the guy on the street is already given instructions for
>>>>>> sending a comment to the TC.
>>>>>>
>>>>>> If the above is acceptable then perhaps I could suggest (and please
>>>>>> note, this is just a strawman for discussion, not an official OASIS
>>>>>> proposal) that we delete some portion of the address after the @ sign.
>>>>>> We could delete all of it, leaving just "duane@", for example, but
>>>>>> then we lose any idea about what company Duane was at, whether Yellow
>>>>>> Dragon or Adobe (and it may be important for IPR reasons to know). So
>>>>>> maybe we could leave the first couple of characters after the @ sign,
>>>>>> resulting in "duane@ye" or "duane@ad". If we left three characters
>>>>>> then we'd get "sun" and "ibm" etc. which would make it possible to
>>>>>> reconstruct the address. But then again with only two we would get "hp".
>>>>>>
>>>>>> So, any comments on whether it should be a requirement for a human to
>>>>>> still be able to figure out the email address? And, if that's not a
>>>>>> requirement, what do you think of my above suggestion?
>>>>>>
>>>>>> -Karl
>>>>>>
>>>>>> p.s. Duane, I hope you don't mind me using you as the example :-)
>>>>>
>>>>
>>>> --
>>>> =================================================================
>>>> Karl F. Best
>>>> Vice President, OASIS
>>>> office +1 978.667.5115 x206 mobile +1 978.761.1648
>>>> karl.best@oasis-open.org http://www.oasis-open.org
>>>>
>>>
>>
>> --
>> =================================================================
>> Karl F. Best
>> Vice President, OASIS
>> office +1 978.667.5115 x206 mobile +1 978.761.1648
>> karl.best@oasis-open.org http://www.oasis-open.org
>>
>

___________________________
Matthew MacKenzie
Senior Architect
IDBU Server Solutions
Adobe Systems Canada Inc.
http://www.adobe.com/products/server/
mattm@adobe.com
+1 (506) 871.5409
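
For anyone picking up the two ideas in this thread, here is a minimal Python sketch of both: the lookup-driven search/replace DW describes (an XML dump of email address + sequence number read into memory by the archive generator) and Karl's strawman of keeping only the first two characters after the @ sign. The thread does not specify an XML layout, so the `members`/`member` element names, the `seq` and `email` attributes, the placeholder text, and the sample addresses are all assumptions made for illustration, as is the deliberately loose address pattern.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical dump format: the thread only says "email address + sequence
# number", so these element and attribute names are assumptions.
SAMPLE_DUMP = """\
<members>
  <member seq="1001" email="duane@example.com"/>
  <member seq="1002" email="karl@example.org"/>
</members>
"""

# Deliberately loose pattern for strings that look like email addresses.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def load_xref(xml_text):
    """Read the XML dump into an in-memory {email: sequence number} map."""
    root = ET.fromstring(xml_text)
    return {m.get("email").lower(): m.get("seq") for m in root.iter("member")}


def disguise(text, xref, unknown="[address removed]"):
    """Replace each address with its sequence number, or a placeholder."""
    def swap(match):
        seq = xref.get(match.group(0).lower())
        return f"[member #{seq}]" if seq else unknown
    return EMAIL_RE.sub(swap, text)


def truncate_domain(address, keep=2):
    """Karl's strawman: keep only the first `keep` characters after the @."""
    local, _, domain = address.partition("@")
    return f"{local}@{domain[:keep]}"


if __name__ == "__main__":
    xref = load_xref(SAMPLE_DUMP)
    print(disguise("From: Duane <duane@example.com>", xref))
    print(truncate_domain("duane@example.com"))
```

Re-running the dump and reloading the map is all that is needed when a member joins or changes address, which matches DW's description. Keys are lowercased here on the assumption that archive matching should ignore case. Note that `truncate_domain` shows the weakness Karl points out: a two-letter company domain such as "hp" survives intact.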