GreaseSpot Cafe

Client-Based Search Spiders (Agents)


ChasUFarley

I have a client who has had me design a directory for them that they plan to publish on the web. It will be in PDF on their web site. They are concerned about internet spiders getting their email addresses and sending them spam.

I did some checking about this on my own and it seems that the email addresses would have to be a hyperlink for the spiders to find them.

Is this correct?

Can I honestly tell my client that the pdf will be just fine and they won't be getting emails about "the little blue pill" tomorrow?

THANKS!


If you can, try making one for yourself and see if it works.

I don't know what a spider is, but if you can get one to check your fake one, see what happens.

Here I used the insert email address function from this posting platform here at GreaseSpot.

cmannixfake@cinci.rr.com I put fake in there just in case a spider finds it. If you click this it will start your email program with that address in the to line.

Is that what you mean by hyperlink? Or do you mean like a link to a webpage like this-

http://gscafe.com/groupee

Edited by CM

quote:
Originally posted by CM:

Steve

You mean they can find it even if it's just typed in anywhere on the web?


Yep.
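As a rough illustration of why a typed-in address is exposed even without a hyperlink, here is a minimal Python sketch of a harvester. The regular expression, the `harvest` function, and the sample page are all made up for this example; real spiders may work differently.

```python
import re

# Hypothetical pattern: anything that looks like name@domain.tld.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def harvest(page_source: str) -> list:
    """Return the unique address-like strings found in raw page text."""
    return sorted(set(EMAIL_RE.findall(page_source)))


# One address is plain text, the other sits inside a mailto hyperlink.
page = """
<p>Contact Jane at jane@example.com for details.</p>
<p><a href="mailto:bob@example.org">Email Bob</a></p>
"""

print(harvest(page))  # ['bob@example.org', 'jane@example.com']
```

Both addresses are picked up, so in this sketch the hyperlink makes no difference: the pattern only needs the text to be present in the page source.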

And that's a brilliant suggestion about the image.

GIF or JPG format is a bit better than PDF, since the text inside a PDF can still be read by software. For a project a few years ago I wrote a search engine that searched through and indexed the contents of PDFs.

But a PDF is still better than straight HTML.
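To show roughly how text can be pulled back out of a PDF, here is a naive Python sketch using only the standard library. It assumes the text sits in Flate-compressed (or uncompressed) content streams as ordinary literal strings; many real PDFs use custom font encodings or split the text up, and this will miss those. The function name, the patterns, and the hand-built sample bytes (not a complete, valid PDF) are assumptions for illustration only, not the search engine mentioned above.

```python
import re
import zlib

# Hypothetical patterns, working on raw bytes.
EMAIL_RE = re.compile(rb"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)


def emails_in_pdf_bytes(data: bytes) -> set:
    """Scan raw PDF bytes, plus any streams zlib can inflate, for addresses."""
    found = set(EMAIL_RE.findall(data))  # catches uncompressed text
    for match in STREAM_RE.finditer(data):
        try:
            # decompressobj tolerates the trailing newline before "endstream"
            inflated = zlib.decompressobj().decompress(match.group(1))
        except zlib.error:
            continue  # not a Flate stream; the raw scan already covered it
        found.update(EMAIL_RE.findall(inflated))
    return found


# Hand-built fragment that imitates one compressed text stream.
sample_pdf = (
    b"%PDF-1.4\n1 0 obj\n<< /Filter /FlateDecode >>\nstream\n"
    + zlib.compress(b"BT (jane@example.com) Tj ET")
    + b"\nendstream\nendobj\n"
)

print(emails_in_pdf_bytes(sample_pdf))
```

An image of the address gives a scanner like this nothing to match, which is the reason GIF or JPG comes out ahead here.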

Now whadda we gonna do about all these damn acronyms? ;)


quote:
Originally posted by pawtucket:

Steve or Rick,

What about a java solution?


I remember reading somewhere about a free program that would take your email and convert it to java code so the java code would be posted on your web site - in a browser people could see your email address and if they clicked on it it would work like normal (start the email program and address a message) but the code would not be intelligible to spiders looking to harvest email addresses.

Unfortunately I don't remember where I read it - (PC Mag perhaps?)
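I can't say which program that was, but assuming "java code" here means JavaScript, the idea can be sketched in Python as a small generator that writes out the script for you. The function name `js_mailto_snippet` and the generated script are hypothetical, a guess at how such a tool might work rather than a copy of any real one.

```python
import json


def js_mailto_snippet(user: str, domain: str) -> str:
    """Build a <script> block that assembles the address in the browser.

    The page source never contains the full address or a literal at-sign;
    the script joins the two halves with String.fromCharCode(64).
    """
    user_js = json.dumps(user)      # quoted and escaped for JavaScript
    domain_js = json.dumps(domain)
    return (
        "<script>\n"
        "(function () {\n"
        f"  var user = {user_js}, domain = {domain_js};\n"
        "  var addr = user + String.fromCharCode(64) + domain;\n"
        "  var link = document.createElement('a');\n"
        "  link.href = 'mailto:' + addr;\n"
        "  link.textContent = addr;\n"
        "  document.currentScript.parentNode.appendChild(link);\n"
        "})();\n"
        "</script>"
    )


snippet = js_mailto_snippet("jane", "example.com")
print(snippet)
```

A visitor with scripting turned on sees a normal clickable address; a spider that only reads the raw source sees two unrelated strings. A spider that actually runs scripts would still get the address.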


Well, you could use a database solution where each person's name has a hyperlink beside it that says email. Click the link, and it returns the record with the email address. You would have to use ASP or PHP though.

***EDIT***

Sorry, you want to use pdf. I forgot about that.

Rick


Okay, here's what you do: replace the "@" sign with "&#64;" - everything BETWEEN those two double quotes - in every email address in your database. The email addresses will show up perfectly on the website, but they won't be searchable. Make sure you use EVERYTHING - the ampersand (&), then the pound sign (#), then 64, then the semicolon (;) - 5 characters in all.
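If the addresses live in a database or a list, the swap can be scripted. This is a minimal Python sketch; `obfuscate_at` is a made-up helper name, and `html.unescape` from the standard library stands in for what a browser does when it renders the page.

```python
import html


def obfuscate_at(address: str) -> str:
    """Replace the at-sign with its 5-character HTML entity."""
    return address.replace("@", "&#64;")


encoded = obfuscate_at("jane@example.com")
print(encoded)                 # jane&#64;example.com
print(html.unescape(encoded))  # jane@example.com
```

The second line of output is also the catch: anything that decodes entities before searching gets the address back, so this only stops scanners that look for a literal "@" in the raw source.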


Steve!'s got a good idea using the ASCII code for the @. I've been working on a couple of projects at work like this. Spammers find email address paths within company intranets this way if employees use them on the internet. Then everyone wonders how they got their email - they just do searches for @ibm.com, etc., and then randomize the front part once they have the path. Some get through, some don't.

The Google crawler will definitely find the email addies if they're in there with the @ sign. I've used Bluzeman's solution, and set up SQL tables on the backend to hold things like email addresses or other data I don't want searchable, and then built an app in an ASP page with recordset calls to the data. Then it only appears on the page once it loads. It's pretty simple if you want the code; your server just has to be set up to run ASP if it isn't already.
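The ASP code itself isn't posted here, so as a stand-in, this is a rough Python sketch of the same lookup idea using the built-in `sqlite3` module. The table layout, the `/email?id=` URL, and the function names are all assumptions for illustration; a real site would put `lookup_email` behind a web handler.

```python
import html
import sqlite3
from typing import Optional


def build_directory(conn: sqlite3.Connection) -> None:
    """Create a hypothetical people table with two sample rows."""
    conn.execute(
        "CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, email TEXT)"
    )
    conn.executemany(
        "INSERT INTO people (name, email) VALUES (?, ?)",
        [("Jane Doe", "jane@example.com"), ("Bob Roe", "bob@example.org")],
    )


def render_listing(conn: sqlite3.Connection) -> str:
    """The public page: names plus an 'email' link, no addresses."""
    rows = conn.execute("SELECT id, name FROM people ORDER BY name").fetchall()
    return "\n".join(
        f'<li>{html.escape(name)} <a href="/email?id={pid}">email</a></li>'
        for pid, name in rows
    )


def lookup_email(conn: sqlite3.Connection, person_id: int) -> Optional[str]:
    """What the link handler would return for a single record."""
    row = conn.execute(
        "SELECT email FROM people WHERE id = ?", (person_id,)
    ).fetchone()
    return row[0] if row else None


conn = sqlite3.connect(":memory:")
build_directory(conn)
print(render_listing(conn))
print(lookup_email(conn, 1))  # jane@example.com
```

The listing page carries no addresses at all. A spider that follows every link would still reach the per-record pages, so this raises the cost of harvesting rather than ruling it out.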

It's an interesting challenge, some good ideas here.


Wow - you guys are incredible!

I just have one question about the ASCII code in this - I've designed this thing using a word processing program - Quark, to be exact. I feel silly for asking this but won't the ASCII code be text - not code - when I replace the "@" text in the layout? Thus, I would also have the ASCII code in the PDF.... Right?


Hey Chas, yes, you'd make a mailto hyperlink in Adobe Acrobat and insert the code where it wants the email address placed; I'm not sure how that's done in Adobe though. I'm not sure if the PDF would display it as code or a hyperlink directly in the text of the page; I'd suspect it would just show the code. I used to make PDFs at work, and it seems there's something you click on to make mailtos and a properties window where you put the email address. Someone else may know for sure where it is. I think I had ver. 4.0 at that time.

Just convert each letter or number in the email address to its code and spell it out.

Here's a quick code reference link if you need one: http://www.lookuptables.com/

And a cool converter tool that will do it for you: http://www.wbwip.com/wbw/emailencoder.html

Pretty cool, just copy and paste the result into whatever Adobe Acrobat uses to make the mailto hyperlink.
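For anyone who would rather script the conversion than use a web tool, here is a small Python sketch of the same idea. The names `encode_all` and `mailto_link` are invented for this example, and it assumes the result goes into an HTML page, where browsers decode the entities; as suspected above, a PDF would most likely just show the raw codes.

```python
import html


def encode_all(address: str) -> str:
    """Spell out every character as a decimal HTML entity."""
    return "".join(f"&#{ord(ch)};" for ch in address)


def mailto_link(address: str, label: str) -> str:
    """Build an HTML mailto link whose address is fully entity-encoded."""
    return f'<a href="mailto:{encode_all(address)}">{html.escape(label)}</a>'


print(encode_all("a@b.co"))  # &#97;&#64;&#98;&#46;&#99;&#111;
print(mailto_link("jane@example.com", "Email Jane"))
```

Nothing in the generated link contains the address as readable text, but it decodes back exactly, so a harvester that handles entities is not fooled.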


holy code!

(John - that last link in your post rocks!)

Thanks again to everyone who has helped out with this. The client I'm working with decided to bring the "spider" issue up at the 11th hour and almost put me into labor! I wanted this project off my desk weeks ago but he has kept tweaking this or that... then the email issue. I didn't know what to tell him but knew who would have the straight answers I needed...

Thanks again!
