Discussion:
[Pydotorg-redesign] how to search the site
Laura Creighton
2003-09-13 14:21:40 UTC
One of the reasons I dislike python.org is that you pretty much have
to know where the links you want are, so you can click on them.
You have a handful of bookmarks, and navigate from there. (At least
I do). This is despite having a search engine. For some reason it
never works to display the information I want.

This is in contrast to John Walker's site http://www.fourmilab.ch/
where I can use Google to find anything I want.

My conclusion is that Google does a better job than Infoseek. The
question is, couldn't we get a Google search for Python.org? With
a button to search all the comp.lang.python articles via google
groups as well? If I were Peter Norvig, I would give us one for
free in exchange for a note saying 'Google is powered by Python',
which we wanted to say anyway.

Has this idea come up before? It strikes me as obvious. Will we
end up hurting somebody's feelings at infoseek if we deep-six it?

Laura
A.M. Kuchling
2003-09-13 19:49:07 UTC
Post by Laura Creighton
Has this idea come up before? It strikes me as obvious. Will we
end up hurting somebody's feelings at infoseek if we deep-six it?
+1. Google already makes it straightforward to add a form to search
your own site with Google (http://www.google.com/services/free.html).

Google requires no sysadmin effort on our part and provides a better
search. Infoseek is a commercial company that uses Python, which is
good, but wanting to support them is no reason to saddle python.org readers
with weaker search features.

--amk
Simon Willison
2003-09-13 20:30:08 UTC
Post by A.M. Kuchling
Post by Laura Creighton
Has this idea come up before? It strikes me as obvious. Will we
end up hurting somebody's feelings at infoseek if we deep-six it?
+1. Google already makes it straightforward to add a form to search
your own site with Google (http://www.google.com/services/free.html).
Google requires no sysadmin effort on our part and provides a better
search. Infoseek is a commercial company that uses Python, which is
good, but wanting to support them is no reason to saddle python.org readers
with weaker search features.
Alternatively, how about building a search engine on top of the
excellent lupy (a port of the open source Lucene Java search engine)?
From my admittedly limited experience of Lupy it is a truly excellent
product - it provides a very powerful API for indexing documents, and a
simple interface for running searches on them.

I've been trying it out on a collection of 800 documents and searches
complete in 0.02 seconds, running on my Windows desktop PC.

http://www.divmod.org/Lupy/

Regards,

Simon Willison
http://simon.incutio.com/
Roy Smith
2003-09-13 21:06:53 UTC
Post by Simon Willison
Alternatively, how about building a search engine on top of the
excellent lupy (a port of the open source Lucene Java search engine)?
From my admittedly limited experience of Lupy it is a truly excellent
product - it provides a very powerful API for indexing documents, and
a simple interface for running searches on them.
Why would you not want to use google, as previously suggested, other
than NIH (Not Invented Here)? It works, it's easy, it's fast, it's
free, we don't have to maintain it ourselves, everybody is familiar
with it, etc. What's not to like?

What does Lupy give us that google doesn't?
Simon Willison
2003-09-13 21:42:53 UTC
Post by Roy Smith
What does Lupy give us that google doesn't?
Lupy gives us three big advantages:

1. Content is indexed as soon as it is added to the site. This is a
critical advantage over Google, which may only reindex once a week or
even once a month.

2. We can customise it (as Tim Parkin has already pointed out).

3. It's written in Python! The official site for the Python programming
language should showcase Python whenever possible. Obviously it is not
worth trading off end user functionality for "powered by Python", but
Lupy really does look like an excellent search engine. Additionally, the
author of the package is likely to be happy to provide extra support and
customisation for the implementation used on the Python.org site.

Best regards,

Simon
Dylan Reinhardt
2003-09-13 23:02:23 UTC
<snip>
Post by Simon Willison
3. It's written in Python! The official site for the Python programming
language should showcase Python whenever possible.
Agreed.

According to www.python.org, "Python has been an important part of
Google since the beginning, and remains so as the system grows and
evolves."

I have no idea what the details are on that claim, but showcasing Google
on python.org seems like a Good Thing to me.

$.02,

Dylan
Barry Warsaw
2003-09-13 23:27:32 UTC
If I were to cast my vote <wink> I'd go for the thing that takes the
least amount of effort to set up and maintain, that doesn't suck. Bonus
points if we include the mailing list archives as a search corpus.

-Barry
Simon Willison
2003-09-14 02:35:07 UTC
Post by Barry Warsaw
If I were to cast my vote <wink> I'd go for the thing that takes the
least amount of effort to set up and maintain, that doesn't suck. Bonus
points if we include the mailing list archives as a search corpus.
I just took a look at the mailing list archives and they total just over
700 MB(!) - the largest is Python-Dev at 111 MB. Loading that lot in to
a search engine could be a painful task. It looks like Google has
indexed them all (incredibly) so a targeted Google search limited to
the mail.python.org domain would probably suffice for mailing lists.
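
As a rough illustration of such a targeted search, the sketch below builds
a Google query URL restricted to one domain with the `site:` operator. The
function name `site_search_url` and the sample query are made up for
illustration; the only things assumed about Google are the `q` query
parameter and the `site:` operator.

```python
from urllib.parse import urlencode


def site_search_url(terms, domain="mail.python.org"):
    """Return a Google search URL limited to a single domain.

    `domain` defaults to the mailing list host mentioned above; the
    function itself is a hypothetical helper, not part of any site code.
    """
    query = "%s site:%s" % (terms, domain)
    return "http://www.google.com/search?" + urlencode({"q": query})


# Example: search the list archives for a phrase
print(site_search_url("list comprehension"))
```

A plain HTML form pointing at the same URL would do the same job with no
server-side code at all.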

I still think there is a big advantage to be had in rolling a custom
search engine for the site though - the ability to highlight certain
site areas for specific keywords for example. I wonder if it would be
possible to use the Google web services API to power a Python.org search
engine? The API terms and conditions www.google.com/apis/api_terms.html
say this:

"""
The Google Web APIs service is made available to you for your personal,
non-commercial use only (at home or at work) [ ... ] And you may not use
the search results provided by the Google Web APIs service with an
existing product or service that competes with products or services
offered by Google.
"""

I have no idea if a search engine for Python.org would count as
"competing with products or services offered by Google". If it doesn't,
a Google API powered search engine would give us all of the benefits of
Google while still allowing the Python site to apply a custom template
to the results and other enhancements (such as recommended site areas for
specific keywords).

Cheers,

Simon
Tim Parkin
2003-09-13 21:18:49 UTC
Post by Roy Smith
Post by Simon Willison
Alternatively, how about building a search engine on top of the
excellent lupy (a port of the open source Lucene Java search engine)?
From my admittedly limited experience of Lupy it is a truly excellent
product - it provides a very powerful API for indexing documents, and
a simple interface for running searches on them.
Why would you not want to use google, as previously suggested, other
than NIH (Not Invented Here)? It works, it's easy, it's fast, it's
free, we don't have to maintain it ourselves, everybody is familiar
with it, etc. What's not to like?
The massive advantage of having your own search engine is that you can
customise it to

1) exclude certain parts of the html in a site (ie furniture and menus)
2) add your own keywords to pages and weight them if necessary
3) add a category based sub-search
4) provide a better summary text for each returned result
5) provide a result title that isn't the page title
6) add your own stop words
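
To make a couple of these points concrete (custom stop words and weighted
per-page keywords), here is a minimal sketch of a toy in-memory index. It
is not Lupy's API and not a proposal for the real implementation; the class
`TinyIndex`, the stop word list, the weight of 5.0 and the sample pages are
all assumptions chosen for illustration.

```python
import re
from collections import defaultdict

# Hypothetical site-specific stop word list
STOP_WORDS = {"the", "a", "of", "and", "to", "in"}


def tokenize(text):
    """Lowercase the text, split on non-alphanumerics, drop stop words."""
    return [word for word in re.findall(r"[a-z0-9]+", text.lower())
            if word not in STOP_WORDS]


class TinyIndex:
    """Toy inverted index mapping term -> {url: score}."""

    def __init__(self):
        self.postings = defaultdict(lambda: defaultdict(float))

    def add(self, url, body, keywords=(), keyword_weight=5.0):
        # Every occurrence of a term in the body counts once
        for term in tokenize(body):
            self.postings[term][url] += 1.0
        # Hand-assigned keywords count for more than body text
        for keyword in keywords:
            for term in tokenize(keyword):
                self.postings[term][url] += keyword_weight

    def search(self, query):
        """Return URLs ordered by descending score, ties broken by URL."""
        scores = defaultdict(float)
        for term in tokenize(query):
            for url, score in self.postings.get(term, {}).items():
                scores[url] += score
        return sorted(scores, key=lambda url: (-scores[url], url))


index = TinyIndex()
index.add("/doc/", "Library reference and tutorial",
          keywords=["documentation", "manual"])
index.add("/news/", "New manual pages were added to the documentation")
print(index.search("manual"))
```

Here the hand-assigned keyword pushes `/doc/` above a page that merely
mentions the word in passing, which is the kind of site-specific tuning a
stock search cannot offer.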

There are more than just these reasons. I've used swish / swish++ and
Lucene and they are both excellent products. I'm with Simon and think Lupy
would be an appropriate Python search engine and would need little
setup.

Google, whilst very useful, can only provide vanilla search results
and even simple site based optimisation can dramatically improve these
results.

I'm with Lupy if in any way at all possible. Obviously if google are
willing to donate a search app or help optimise our results then
fantastic.

I would suggest:-

1) Launch with google search
2) Ask google for an optimised solution and, if we can't get one, add a Lupy search

Tim
Tim Parkin
2003-09-14 11:22:42 UTC
Hi all,

I'm getting to the point where I'm knocking out some html for the python
site to see what can be done and can't be done. (at this point I'm sure
everything can be done but to what extent and how efficiently we'll
see).

My big (BIG!!!) question is support for netscape 4.

I'd love to be able to make sure it works visually in netscape 4 but if
we have to do that then we lose an incredible amount of the advantages
of css and semantic html. Basically if we want it to work in netscape 4
then I think we're back to tables (unless someone thinks the design can
be implemented in css layout and still work in netscape, I've tried and
not had much success).

If it's an absolute must then we'll have to try harder or dump the
design. If, like many many other sites on the Internet, we say we can
live as long as the site is usable in netscape, then we're back onto a
winner. This might mean that the site becomes very much a text site (no
images, lynx style layout) or we can play with style sheets to see if we
can come up with an alternative and simpler layout for netscape
(filtered using style sheet hacks).

I'd really like to be able to build the site using full semantic
HTML/css layout as it would make it extremely fast / accessible and
also get a lot of attention from a lot of web developers.

If you want to see the sort of html I'm talking about and how it renders
in things like Lynx, check out

Shown with enlarged font :
"Developer Works HTML at large font size"
http://pollenation.net/journal/assets/images/DeveloperWorks-ScaledFont.gif
"Python Proof of Concept HTML at large font size"
http://pollenation.net/journal/assets/images/PythonProofOfConcept-ScaledFont.gif

Shown with CSS removed :
"Developer Works HTML at large font size with CSS removed"
http://pollenation.net/journal/assets/images/DeveloperWorks-ScaledFont-NoCSS.gif
"Python Proof of Concept HTML at large font size with CSS removed"
http://pollenation.net/journal/assets/images/PythonProofOfConcept-ScaledFont-NoCSS.gif

Shown in Lynx :
"Developer Works HTML in Lynx"
Loading Image...
"Python Proof of Concept HTML in Lynx"
http://pollenation.net/journal/assets/images/PythonProofOfConcept-Lynx.gif

The css removed is an example of what it would look like in netscape 4
(obviously with a smaller font if that's what's specified) if we don't
apply any alternative styling. If we don't require 100% visual integrity
in netscape 4, then I think we can make netscape look a lot better than
that.

Hope this hasn't just confused people... Just to restate the basic
question

Do we :-

1) Say we need netscape 4 to look exactly like the design and sacrifice
a lot of the advantages of modern CSS etc

2) Say we can compromise on netscape 4 as long as it's usable and
presentable and allow us to use all the modern facilities that css2/dom
etc give us.

My answer is 2 but I think you expected that :-) ....

Tim

----------------------------------------------
Tim Parkin
Managing Director
Pollenation Internet Ltd
www.pollenation.net
m : 07980 59 47 68
t : 01132 25 25 00

Simon Willison
2003-09-14 12:16:53 UTC
Post by Tim Parkin
My big (BIG!!!) question is support for netscape 4.
I'd love to be able to make sure it works visually in netscape 4 but if
we have to do that then we lose an incredible amount of the advantages
of css and semantic html. Basically if we want it to work in netscape 4
then I think we're back to tables (unless someone thinks the design can
be implemented in css layout and still work in netscape, I've tried and
not had much success).
Netscape 4's market share is so tiny now as to be insignificant. It
really isn't worth investing much NS 4 specific work. At the same time,
the site has to at least function in NS 4 and preferably not look like
it was created by rank amateurs to users of that browser.

I suggest using the classic dual stylesheets approach. The basic
stylesheet (as used by Netscape 4) can set font styles, header styles
and anything else that NS4's highly limited CSS can safely support. Then
@import an advanced stylesheet with the real site design in it. A good
example of a site that does this effectively is www.stopdesign.com -
it's amazing how much better a site looks with just some rudimentary
text colours on headings.
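
A minimal sketch of what that markup could look like, written as a small
Python helper that renders the `<head>` fragment. It relies on the fact
that Netscape 4 does not understand `@import`, so it only ever loads the
linked basic sheet. The function name and the two stylesheet paths are
placeholders, not the site's actual files.

```python
def stylesheet_links(basic_href, advanced_href):
    """Return <head> markup for the dual stylesheet technique.

    The basic sheet is linked normally, so every CSS-capable browser
    (including Netscape 4) loads it. The advanced sheet is pulled in
    with @import, which Netscape 4 ignores.
    """
    return (
        '<link rel="stylesheet" type="text/css" href="%s">\n'
        '<style type="text/css">\n'
        '  @import url("%s");\n'
        '</style>' % (basic_href, advanced_href)
    )


# Placeholder paths, for illustration only
print(stylesheet_links("/styles/basic.css", "/styles/advanced.css"))
```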

However, before finalising any decision along these lines it's vital to
know the current browser usage as experienced by the existing python.org
site. If it turns out for some bizarre reason that NS4 users are a
significant percentage of visitors the site will need to make more of an
effort to provide them with a visually attractive design.

Cheers,

Simon Willison
http://simon.incutio.com/
A.M. Kuchling
2003-09-14 14:09:15 UTC
Post by Simon Willison
However, before finalising any decision along these lines it's vital to
know the current browser usage as experienced by the existing python.org
site. If it turns out for some bizarre reason that NS4 users are a
See http://www.python.org/wwwstats/usage_200308.html#TOPAGENTS . From
the full list of agents, various NS4 variants turn out to account for
only 0.6% of accesses to python.org.

--amk
Laura Creighton
2003-09-14 14:55:30 UTC
Post by A.M. Kuchling
Post by Simon Willison
However, before finalising any decision along these lines it's vital to
know the current browser usage as experienced by the existing python.org
site. If it turns out for some bizarre reason that NS4 users are a
See http://www.python.org/wwwstats/usage_200308.html#TOPAGENTS . From
the full list of agents, various NS4 variants turn out to account for
only 0.6% of accesses to python.org.
--amk
When you are counting 'US Commercial' are you just looking at .com
endings? We're strakt.com but we're not in the USA...

Laura
Skip Montanaro
2003-09-15 14:39:44 UTC
Tim> My big (BIG!!!) question is support for netscape 4.

Why? Look at

http://www.python.org/wwwstats/agent_200308.html

and see if you still think Netscape 4 really matters.

Skip
Simon Willison
2003-09-15 15:17:48 UTC
Post by Skip Montanaro
Why? Look at
http://www.python.org/wwwstats/agent_200308.html
and see if you still think Netscape 4 really matters.
It took a while (those stats are pretty hard to figure out) but
eventually I figured that the various versions of Netscape 4 account
for 1.17% of hits to the Python site. In case anyone wants to check
themselves, here's the code I used (after first saving the stats from
that page into a text file):

import re

# Regexp to extract the number at the start of a line
num = re.compile(r'^(\d+)')

# Load in the lines from the file
lines = open('python-browser-stats.txt').readlines()
# Filter out any lines that don't start with a number
lines = [line for line in lines if num.match(line)]
# Find all lines referring to a Netscape 4 version
netscape = [line for line in lines if
            'Mozilla/4' in line and
            'compatible' not in line and
            'Gecko' not in line]
# Build a list of hit counts, one per NS4 user agent string
nscounts = [int(num.match(line).groups()[0]) for line in netscape]
# Do the same for ALL user agent strings
allcounts = [int(num.match(line).groups()[0]) for line in lines]
# Now sum the above lists
nstotal = sum(nscounts)
alltotal = sum(allcounts)
# And calculate the percentage
print(float(nstotal) / alltotal * 100)

The Netscape 4 list comprehension is based on the idea that Netscape 4's
user agent string always contains 'Mozilla/4', but then so does the
string of a number of other browsers. Filtering on 'compatible' removes
Microsoft browsers, and filtering on 'Gecko' removes any gecko variants.
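
The same heuristic can be pulled out into a small function and tried
against a few user agent strings. The sample strings below are typical
examples made up for illustration, not lines from the python.org stats,
and the function name `is_netscape4` is likewise just a label for this
sketch.

```python
def is_netscape4(user_agent):
    """Heuristic: 'Mozilla/4' present, 'compatible' and 'Gecko' absent."""
    return ("Mozilla/4" in user_agent
            and "compatible" not in user_agent
            and "Gecko" not in user_agent)


# Illustrative user agent strings (not taken from the server logs)
samples = [
    "Mozilla/4.78 [en] (Windows NT 5.0; U)",
    "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)",
    "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030624",
]
for ua in samples:
    print(is_netscape4(ua), ua)
```

Only the first sample passes the filter. Any other client that sends a
bare 'Mozilla/4' string without 'compatible' would be counted as well, so
the figure is an upper bound rather than an exact count.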

Cheers,

Simon
Tim Parkin
2003-09-14 12:57:44 UTC
Post by Simon Willison
I suggest using the classic dual stylesheets approach. The basic
stylesheet (as used by Netscape 4) can set font styles, header styles
and anything else that NS4's highly limited CSS can safely support. Then
@import an advanced stylesheet with the real site design in it. A good
example of a site that does this effectively is www.stopdesign.com -
it's amazing how much better a site looks with just some rudimentary
text colours on headings.
How realistic is it to try to get an alternate stylesheet that
generates a header and left/right column layout? I would like to be able
to offer this as an alternate; obviously it depends on the html, but do
you think this is realistic?

Tim
Simon Willison
2003-09-14 13:13:39 UTC
Post by Tim Parkin
How realistic is it to try to get an alternate stylesheet that
generates a header and left/right column layout? I would like to be able
to offer this as an alternate, obviously it depends on the html but do
you think this is realistic.
It depends on how much work you want to invest in NS4 support.
www.realworldstyle.com has some NS4 friendly templates so it's
definitely possible.

Cheers,

Simon
Tim Parkin
2003-09-14 13:51:56 UTC
Post by Simon Willison
Post by Tim Parkin
How realistic is it to try to get an alternate stylesheet that
generates a header and left/right column layout? I would like to be
able to offer this as an alternate, obviously it depends on the html but
do you think this is realistic.
It depends on how much work you want to invest in NS4 support.
www.realworldstyle.com has some NS4 friendly templates so it's
definitely possible.
It's replacing the real style sheet with a netscape friendly
header/2column layout without changing html... I suppose it's just try
it and see...

Tim
Tim Parkin
2003-09-14 15:00:42 UTC
Post by Laura Creighton
Post by A.M. Kuchling
See http://www.python.org/wwwstats/usage_200308.html#TOPAGENTS . From
the full list of agents, various NS4 variants turn out to account for
only 0.6% of accesses to python.org.
--amk
When you are counting 'US Commercial' are you just looking at .com
endings? We're strakt.com but we're not in the USA...
Unless you use geoip, or equivalent, the country of origin information
is worse than useless.. And the stats package used doesn't use geoip, or
equivalent.

Did this have a relevance to Netscape 4 or was it just a general
question?

Tim
Laura Creighton
2003-09-14 15:06:26 UTC
Post by Tim Parkin
Post by Laura Creighton
Post by A.M. Kuchling
See http://www.python.org/wwwstats/usage_200308.html#TOPAGENTS . From
the full list of agents, various NS4 variants turn out to account for
only 0.6% of accesses to python.org.
--amk
When you are counting 'US Commercial' are you just looking at .com
endings? We're strakt.com but we're not in the USA...
Unless you use geoip, or equivalent, the country of origin information
is worse than useless.. And the stats package used doesn't use geoip, or
equivalent.
Did this have a relevance to Netscape 4 or was it just a general
question?
Tim
Just general. I am making a funding proposal, and would like to make
some sort of guess as to how many python users there are in Europe and
world-wide. I was wondering if this info could be used for that ...

Laura
Tim Parkin
2003-09-14 15:56:02 UTC
Post by Laura Creighton
Just general. I am making a funding proposal, and would like to make
some sort of guess as to how many python users there are in Europe and
world-wide. I was wondering if this info could be used for that ...
Don't forget that the stats are only for the main site, I don't
know what hits the mirrors get or how that information might be
made available.

Could prove interesting as this could increase the numbers dramatically
(if it doesn't then we overestimate the importance of the mirrors).

Tim
Laura Creighton
2003-09-14 16:31:25 UTC
Post by Tim Parkin
Post by Laura Creighton
Just general. I am making a funding proposal, and would like to make
some sort of guess as to how many python users there are in Europe and
world-wide. I was wondering if this info could be used for that ...
Don't forget that the stats are only for the main site, I don't
know what hits the mirrors get or how that information might be
made available.
Could prove interesting as this could increase the numbers dramatically
(if it doesn't then we overestimate the importance of the mirrors).
Tim
Yeah, great point. musing out loud ... I use a mirror for the docs,
but when I just want to find things on the site, I get lazy and type
python.org a lot....

Laura
Tim Parkin
2003-09-15 16:04:27 UTC
Post by Simon Willison
It took a while (those stats are pretty hard to figure out) but
eventually I figured that the various versions of Netscape 4 account
for 1.17% of hits to the Python site. In case anyone wants to check
themselves, here's the code I used (after first saving the stats from
I got 1.126% after removing palm browsers that pretend to be nn4 and
also some robots (anything with compatible in). I think there were still
a few in there.. HTTrack pretends to be NN4 too.. Oh and opera does
as well sometimes, and omniweb and web washer.

Nice to see we're in the same ball park :-)

Tim
Roy Smith
2003-09-16 00:58:02 UTC
Post by Simon Willison
Post by Roy Smith
What does Lupy give us that google doesn't?
1. Content is indexed as soon as it is added to the site. This is a
critical advantage over Google, which may only reindex once a week or
even once a month.
Is this really critical? Only an extremely tiny fraction of the site
changes month to month, and new stuff tends to be highlighted in some
kind of 'what's new' section on most web sites anyway.
Post by Simon Willison
2. We can customise it (as Tim Parkin has already pointed out).
We can, but is it worth the trouble?

Also, as a tie-in to the marketing discussion, I think there may be a
certain advantage to using google's stock service rather than something
we customized. Let's say I'm a dev manager, and one of my team leads
is pitching using Python for our latest project. I'm nervous about the
idea, but agreed to at least think about it. I go to the python web
site to see what I can find there. There is something comforting about
seeing a "powered by google" search box on the home page. It's
familiar, it's got positive associations, and it's something I already
know how to use.
Post by Simon Willison
3. It's written in Python! The official site for the Python
programming language should showcase Python whenever possible.
Of course we should. Google uses python. They're even one of our
"reference customers", as the marketing folks would say.

Whatever search engine we end up using, we should have a search box on
every single page (in some standard place as part of the navigation
tools). It's so much more convenient than having to navigate your way
to some special search page.
Simon Willison
2003-09-16 01:16:46 UTC
Post by Roy Smith
Whatever search engine we end up using, we should have a search box on
every single page (in some standard place as part of the navigation
tools). It's so much more convenient than having to navigate your way to
some special search page.
I agree completely. A single, simple "Search" box on each page should be
an essential part of the navigation. Any advanced options (such as
"search only in documentation") should be made available only on a
dedicated search page.

Cheers,

Simon
Skip Montanaro
2003-09-16 15:10:45 UTC
Post by Simon Willison
1. Content is indexed as soon as it is added to the site.
Roy> Is this really critical? Only an extremely tiny fraction of the
Roy> site changes month to month, and new stuff tends to be highlighted
Roy> in some kind of 'what's new' section on most web sites anyway.

I doubt it. Also, I believe Google's crawling is adapted to the rate of
change it encounters on a site over time. I don't know if that's considered
at the page level, but I suspect so. It allows the crawler to profitably
visit pages which change a lot.

I vote for Google, at least for the time being. Anything else seems like a
lot of effort at this point, for no obvious extra benefit. I'd put a little
search box at the top of the left margin, right below Just's logos.

Skip
Tim Parkin
2003-09-16 16:31:14 UTC
Post by Skip Montanaro
Post by Simon Willison
1. Content is indexed as soon as it is added to the site.
Roy> Is this really critical? Only an extremely tiny fraction of the
Roy> site changes month to month, and new stuff tends to be highlighted
Roy> in some kind of 'what's new' section on most web sites anyway.
I doubt it. Also, I believe Google's crawling is adapted to the rate of
change it encounters on a site over time. I don't know if that's considered
at the page level, but I suspect so. It allows the crawler to profitably
visit pages which change a lot.
I vote for Google, at least for the time being. Anything else seems like a
lot of effort at this point, for no obvious extra benefit. I'd put a little
search box at the top of the left margin, right below Just's logos.
Unless we can get google to provide one of its commercial site search
facilities, the google free search takes up a lot of room. This is
presumably for google's advertising. Also you have to have the google
full web search available as well. I'm not sure this is what we want on
the site.

Tim

PS: an example of how it might look with the latest design home page.
According to my quick scan of the t's and c's we can't modify this in any
way.

http://pollenation.net/assets/public/python-google.gif
Skip Montanaro
2003-09-16 17:24:47 UTC
Tim> Unless we can get google to provide one of its commercial site
Tim> search facilities, the google free search takes up a lot of
Tim> room. This is presumably for google's advertising. Also you have to
Tim> have the google full web search available as well. I'm not sure this
Tim> is what we want on the site.

I'm not sure what you mean by "takes up a lot of room". We're interested in
searching python.org, right? There's no obligation to display a Google
graphic as far as I know, just that any Google trademarks displayed need a
"TM" sign. Similarly, I don't think we're obliged to provide a web
search option. See attached.

Skip
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/pydotorg-redesign/attachments/20030916/cc4ec030/google.html
Todd Grimason
2003-09-16 19:11:40 UTC
Well, Python(TM) by Google would actually probably lead to a huge
increase in usage and awareness of Python. Over the objections of, oh,
a few hundred lawyers and millions of others.

It'd get the name Python out there though!

(referring to the way it ends up appearing on the page obviously...)
--
___________________________
toddgrimason****@slack.net
Laura Creighton
2003-09-17 07:17:43 UTC
Post by Tim Parkin
PS: an example of how it might look with the latest design home page.
According to my quick scan of t's and c's we can't modify this in any
way.
http://pollenation.net/assets/public/python-google.gif
No dropping the pointsize? I would rather make a 'Google Search'
_Button_ that incorporates the Logo.

While we are at it, I would prefer a larger window to type search
terms in, a button for search just the docs, and just the starship,
and if we have to use that logo, and cannot put it in a button,
I would like it stuck on the bottom right corner of the page, or to
the far right, or somewhere more removed from the Python logo,
which I think is crowded. Perhaps putting it in a box would help,
I don't know.

What do others think? I suggest we make what we want, and ask for
permission.

Laura
Tim Parkin
2003-09-16 19:41:34 UTC
Post by Skip Montanaro
I'm not sure what you mean by "takes up a lot of room". We're interested in
searching python.org, right? There's no obligation to display a Google
graphic as far as I know,
From Google's terms and conditions:
Personal Use Only
The Google Services are made available for your personal,
non-commercial use only. You may not use the Google Services
to sell a product or service, or to increase traffic to your
Web site for commercial reasons, such as advertising sales.
You may not take the results from a Google search and reformat
and display them, or mirror the Google home page or results
pages on your Web site. You may not "meta-search" Google.
If you want to make commercial use of the Google Services,
you must enter into an agreement with Google to do so in
advance. Please contact us for more information.
If you are interested in adding a Google search box to
your web site or your company's web site, we encourage
you to do so
The last is a link to the google free search service, this
is the service that can be used for free and the one from
which I quickly (todd :-) drew up the screenshot.

Google are very hard on unauthorised use of their service
even down to removing the offending site from their indexes.

Of course someone may have friends in high places...

Tim
Skip Montanaro
2003-09-16 20:04:29 UTC
Skip> I'm not sure what you mean by "takes up a lot of room". We're
Skip> interested in searching python.org, right? There's no obligation
Skip> to display a Google graphic as far as I know,

Tim> From Googles terms and conditions

Google> Personal Use Only
Google> The Google Services are made available for your personal,
Google> non-commercial use only. You may not use the Google Services
Google> to sell a product or service, or to increase traffic to your
Google> Web site for commercial reasons, such as advertising sales.
Google> You may not take the results from a Google search and reformat
Google> and display them, or mirror the Google home page or results
Google> pages on your Web site. You may not "meta-search" Google.
Google> If you want to make commercial use of the Google Services,
Google> you must enter into an agreement with Google to do so in
Google> advance. Please contact us for more information.
Google>
Google> If you are interested in adding a Google search box to
Google> your web site or your company's web site, we encourage
Google> you to do so

Tim> The last is a link to the google free search service, this is the
Tim> service that can be used for free and the one from which I quickly
Tim> (todd :-) drew up the screenshot.

But there's no obligation to use precisely that HTML, is there? I assume
you're worried about the "your personal, non-commercial use only" phrase.
If we're in violation of that I doubt it makes any difference whether
we display a Google <img> tag or simply "Google (TM)". Since all we'd be
doing is vectoring people off to Google's site I don't think any of the
other stuff applies ("advertising sales", "increasing traffic",
"meta-search", etc).

Tim> Google are very hard on unauthorised use of their service even down
Tim> to removing the offending site from their indexes.

I know. I wanted to use AdSense on the Mojam and Musi-Cal sites. About 90%
of the page views are generated search results, which they don't allow. I
tried hard to convince them that the ads being displayed were appropriate to
the displayed contents, but they wouldn't budge.

Tim> Of course someone may have friends in high places...

I'm sure Guido does, however I'm still unsure where the problem lies.
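
A minimal sketch of the "vectoring people off to Google's site" idea discussed above: the site only builds a link and sends the visitor to Google's own results page, so nothing is fetched, reformatted or mirrored locally. The helper name `google_site_search_url` is hypothetical, and restricting results with the `site:` query operator is an assumption made for illustration; it is not necessarily what Google's free search form emits.

```python
from urllib.parse import urlencode

GOOGLE_SEARCH = "https://www.google.com/search"  # Google's public results page


def google_site_search_url(terms, domain="python.org"):
    """Return a Google results URL, optionally limited to one domain.

    Hypothetical helper: it only builds a link, so the visitor ends up
    on Google's own results page.
    """
    # An empty domain means "search the whole web".
    query = f"{terms} site:{domain}" if domain else terms
    return f"{GOOGLE_SEARCH}?{urlencode({'q': query})}"


print(google_site_search_url("list comprehension"))
print(google_site_search_url("list comprehension", domain=""))
```

Passing an empty `domain` gives the plain web search, which is how a web/site radio button pair could map onto the same helper.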

Skip
Tim Parkin
2003-09-16 22:16:18 UTC
Permalink
Skip >"your personal, non-commercial use only" phrase.
Skip >If we're in violation of that I doubt it makes any difference
Skip >whether we display a Google <img> tag or simply "Google (TM)".

"3.3 Attribution. The search box (or other means used by an End User to
enter a search query) shall conspicuously display a graphic (available
at http://www.google.com/stickers.html) that indicates that the Services
are provided by Google. The graphic shall link to the Google site
located at www.google.com or such other address as Google may designate
from time to time during the Term."

I may have misunderstood the previous terms and conditions, it seems
that they are probably happy as long as it's a graphic from the
stickers.html page.

I had heard of people getting into trouble for using the sitesearch
without the websearch and without the graphical images (I hadn't seen
the smaller images previously and I still can't find anywhere that says
you can use text).

I think any search box should also have 'search the www' as well
(according to t's & c's again) but assuming we don't then I've mocked up
a logo'd version of the home page and if this is possible then I'd be
happy to use it until we set up a Lupy (or equivalent) search customised
to offer a dedicated site search.

Tim

http://pollenation.net/assets/public/python-google.png
Skip Montanaro
2003-09-16 22:36:45 UTC
Permalink
Skip> "your personal, non-commercial use only" phrase. If we're in
Skip> violation of that I doubt it makes any difference whether
Skip> we display a Google <img> tag or simply "Google (TM)".

Tim> "3.3 Attribution. The search box (or other means used by an End
Tim> User to enter a search query) shall conspicuously display a graphic
Tim> (available at http://www.google.com/stickers.html) that indicates
Tim> that the Services are provided by Google.
...
Tim> I think any search box should also have 'search the www' as well
Tim> (according to t's & c's again) but assuming we don't then I've
Tim> mocked up a logo'd version of the home page and if this is possible
Tim> then I'd be happy to use it until we set up a Lupy (or equivalent)
Tim> search customised to offer a dedicated site search.

You must be a customer. You keep changing the requirements. ;-)

See attached.

Skip
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/pydotorg-redesign/attachments/20030916/d3508ded/google.html
Laura Creighton
2003-09-17 07:45:43 UTC
Permalink
I posted too soon. Tim has been thinking like me.
Post by Tim Parkin
I think any search box should also have 'search the www' as well
(according to t's & c's again) but assuming we don't then I've mocked up
a logo'd version of the home page and if this is possible then I'd be
happy to use it until we set up a Lupy (or equivalent) search customised
to offer a dedicated site search.
Tim
http://pollenation.net/assets/public/python-google.png
I'd still prefer the 'Google' IN the Search Button, and I would prefer
the site doc starship newsgroups mailing lists www checkboxes below.
This would give me a nice wide search window, another plus. But
I also want a way to click 'get a more advanced search page'

'External resources' is a bad wording to put on a search box that
is to be read, let alone used, by newbies. They have no idea what
is and what is not external. And so they feel stupid, as if they
were supposed to know this already. This feels worse than not
'knowing what the starship is' -- because it is open ended.

For other searches with that problem I have made a line
'Don't know what any of these are? Just search for them (happy face)'
(Then make sure that a definition/Welcome page was the first hit).
This has the advantage of getting people to use the site...

I am not opposed to spending space on search boxes, if it is useful
space that does stuff and makes searching easier. I like sites that
look as if the people cared more about the searchers than about their
prose. Some sites it's 'yap yap yap, will these people never shut
up and tell me how to find what I want?' Admittedly, it is ads that
get this effect out of me the most, but it is always there.

I think that this has a profound effect on how friendly a site is
found to be -- respect goes a long way.

Laura
Skip Montanaro
2003-09-17 15:54:16 UTC
Permalink
Laura> I'd still prefer the 'Google' IN the Search Button,

If you do anything to change the Google logo, they'd have to approve. Of
course, you could make it an image button, but then it doesn't look like a
pokable button.

Laura> and I would prefer the site doc starship newsgroups mailing lists
Laura> www checkboxes below. This would give me a nice wide search
Laura> window, another plus. But I also want a way to click 'get a more
Laura> advanced search page'

I think that anything beyond searching the python.org domain or the entire
web should go on a separate search page. As I wrote to Tim in an email
off-list:

In general, people know what simple search boxes are for. If there's no
web vs. site radio button, the implication is that they search the site
(or perhaps all sites in the domain). If there are a pair of radio
buttons they select between searching the web or the site/domain. If
you add other semantics (how will you tell them "search" is a special
token?) I think you'll just confuse them.

Skip
Tim Parkin
2003-09-17 09:19:47 UTC
Permalink
Post by Laura Creighton
Post by Tim Parkin
http://pollenation.net/assets/public/python-google.png
I'd still prefer the 'Google' IN the Search Button, and I would prefer
the site doc starship newsgroups mailing lists www checkboxes below.
This would give me a nice wide search window, another plus. But
I also want a way to click 'get a more advanced search page'
If we use google we won't get any more advanced features or sub searches
etc. It's a widely observed fact that only about 0.5% of people use
the advanced search anyway although for those that do it's a very
important feature. Taking this into account, a basic full search should
be available everywhere and an advanced search should be available on
the results page. These advanced search options are the reason I for one
would like to use a dedicated search like lupy. And no we can't change
the size of the google logo and it looked silly inside the button
unfortunately.
Post by Laura Creighton
'External resources' is a bad wording to put on a search box that
is to be read, let alone used, by newbies. They have no idea what
is and what is not external. And so they feel stupid, as if they
were supposed to know this already. This feels worse than not
'knowing what the starship is' -- because it is open ended.
External resources is not the search box, it's a quick link to outside
resources (trying to get rid of all those external links whilst making
them still quickly available) so it clearly needs making more obvious
than it is at the moment. I'll think on that one.
Post by Laura Creighton
I am not opposed to spending space on search boxes, if it is useful
space that does stuff and makes searching easier. I like sites that
look as if the people cared more about the searchers than about their
prose. Some sites it's 'yap yap yap, will these people never shut
up and tell me how to find what I want?' Admittedly, it is ads that
get this effect out of me the most, but it is always there.
As I said I think it's great to have a search box on every page but just
a simple one with room for two or three words. Anything more advanced
can use the 'advanced search' facility, which they should only be using
if they don't get results from the simple search. My tendency now is to
try to make the basic search as useful as possible (with domain specific
results and offering alternatives) instead of forcing people to try to
use an advanced search, although the advanced search should still be
available.

Tim
Laura Creighton
2003-09-17 10:05:27 UTC
Permalink
Post by Tim Parkin
Post by Laura Creighton
Post by Tim Parkin
http://pollenation.net/assets/public/python-google.png
If we use google we won't get any more advanced features or sub searches
etc. It's a widely observed fact that only about 0.5% of people use
the advanced search anyway although for those that do it's a very
important feature. Taking this into account, a basic full search should
be available everywhere and an advanced search should be available on
the results page. These advanced search options are the reason I for one
would like to use a dedicated search like lupy. And no we can't change
the size of the google logo and it looked silly inside the button
unfortunately.
I think you can get one size smaller than the one you are using, unless
I have an old version of your page here. There is also the
'Powered by Google' sticker which might look better if we cannot
get it in a button.
Post by Tim Parkin
Post by Laura Creighton
'External resources' is a bad wording to put on a search box that
is to be read, let alone used, by newbies. They have no idea what
is and what is not external. And so they feel stupid, as if they
were supposed to know this already. This feels worse than not
'knowing what the starship is' -- because it is open ended.
External resources is not the search box, it's a quick link to outside
resources (trying to get rid of all those external links whilst making
them still quickly available) so it clearly needs making more obvious
than it is at the moment. I'll think on that one.
I'm sorry I was not clear. Let me try again.

I am vehemently opposed to the words 'external resources'. This is
because when a newbie shows up, wants to search something, and
sees the box, they will say 'what is an external resource'. Either
they will be completely blank, and the notion of 'some resources are
external to python.org and some are not' is a new concept which they
do not know, yet, or they are familiar with that concept but have no
idea what stuff we have here, and what lives other places (like
pythonology.org) and whether we consider the mailing lists as part
of python.org or not.

When you are using a search form, you are saying 'I don't know where
this stuff is'. Thus when you first need to know something that
you don't know, the experience is very frustrating, often frustrating
enough to make complete newbies _leave the site_.

Searching 'other python websites not connected to python.org' is
a better option. (Plus it will encourage others to link to us
for this service, which means we get automatic updates on lists
of all python resources. This is good.)

I think searching the python cookbook would be wonderful. And since
I hate the way ActiveState's search works, too, I'd love a
different interface to use it.

Searching 'the vaults' might be good, though not if we intend to
go through the lot and make them Packages. Then making a
'how to find a package' page one click away makes sense, but I still
don't know what our intentions are w.r.t. packages.

(I'd love to discuss that, but I also have to catch a train to
Stockholm, be back Thursday late. )

Which of these things to search belongs on an advanced search form,
and which belongs on the box, needs to be thought about. First
we need the list of categories.
Post by Tim Parkin
As I said I think it's great to have a search box on every page but just
a simple one with room for two or three words. Anything more advanced
can use the 'advanced search' facility, which they should only be using
if they don't get results from the simple search. My tendency now is to
try to make the basic search as useful as possible (with domain specific
results and offering alternatives) instead of forcing people to try to
use an advanced search, although the advanced search should still be
available.
I don't see why '2 or 3 words' is connected to whether you need an advanced
search. I often search for exact phrases. Whenever I see a small
search box, I always wonder 'do these people simply not care about
usability, or am I really all that weird when it comes to searches?'

Of course, from my point of view, there isn't anything that you could
use the space for that I would like as much as enough room to type in
'Alex Martelli' + 'Hotel Belfiori' or other things of that length
that I want to search on. (That's Alex's article that explains
what a reference is in a neat way, by the way).

Laura
Simon Willison
2003-09-17 10:17:11 UTC
Permalink
The best advice I have ever seen on search engine usability is in Steve
Krug's excellent book "Don't Make Me Think". There are two possible
options for a search interface (apologies for the poor ASCII art):

Search: |________| |Go!|

or

|________| |Search|

Any other wording will cause people to have to stop and think, which is
bad. Additional options etc are fine, but they should be hidden away on
the advanced search page. The labels above are particularly important -
the single word "Search" MUST be included and the action word "Go!" should
be used on the button only if the word "Search" has already been used.
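
The labelling rule above can be sketched in a few lines of Python that render the two permitted variants as HTML. This is only an illustration of the rule, not anything proposed in the thread: the function name `search_form`, the `/search` action and the field name `q` are all hypothetical.

```python
from html import escape


def search_form(action="/search", label=True):
    """Render a minimal search form as an HTML string.

    label=True  -> a 'Search:' label with a 'Go!' button
    label=False -> no label, so the button itself must say 'Search'
    """
    # "Go!" is only allowed when the word "Search" already appears as a label.
    button = "Go!" if label else "Search"
    prefix = '<label for="q">Search:</label> ' if label else ""
    return (
        f'<form action="{escape(action, quote=True)}" method="get">'
        f'{prefix}<input type="text" id="q" name="q" size="30"> '
        f'<input type="submit" value="{button}">'
        "</form>"
    )


print(search_form())             # Search: |________| |Go!|
print(search_form(label=False))  # |________| |Search|
```

Either way the word "Search" appears exactly once next to the box, which is the point of the guideline.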

I would suggest a possible third way that makes use of a snippet of
javascript, based on a modification of the first method shown above: put
the word Search /in/ the search box (greyed out slightly) and have it
disappear when the user first focuses on the box. Using the labels.js
technique[1], this can be achieved without making the interface more
confusing for users with javascript disabled.

All usability advice should be taken as guidelines rather than absolute
rules, but in this case the guideline is such a good one that I've never
seen any reason to deviate from it[1].

Cheers,

Simon
http://simon.incutio.com/

[1] Real nit-pickers may notice my blog uses "Search Site" rather than
just "Search". Rules are made to be broken ;)
Tim Parkin
2003-09-17 11:16:00 UTC
Permalink
Post by Simon Willison
The best advice I have ever seen on search engine usability is in Steve
Krug's excellent book "Don't Make Me Think". There are two possible
Absolutely... Here are a few other resources that support that...

http://www.useit.com/alertbox/20010513.html
http://world.std.com/~uieweb/observng.htm
http://www.currybet.net/articles/day_in_the_life/3.shtml

They also insist that advanced searches are far less important than
developers think they are. Although I removed the 'go' in that last
mock-up, it was as a result of playing around with a google button (which
didn't work anyway) and it's back to 'go' again now.
Post by Simon Willison
I would suggest a possible third way that makes use of a snippet of
javascript, based on a modification of the first method shown above: put
the word Search /in/ the search box (greyed out slightly) and have it
disappear when the user first focuses on the box. Using the labels.js
technique[1], this can be achieved without making the interface more
confusing for users with javascript disabled.
That's the approach I've taken and it seems to work in practice too.

Tim
Roy Smith
2003-09-17 13:48:06 UTC
Permalink
Why do we need different buttons to search different parts of the
collection? "search docs" vs. "search whole web site" vs. "search
starship" implies that the searcher already knows where the stuff is.
Most people won't. Roy's Rule of UI Design says "every option you give
the user is just another chance for them to do the wrong thing".

If we really have to have these options, at least they should be tucked
away on the "advanced search" page. The default interface (on every
page) should be just a simple box into which you can type a couple of
words, and a single "Search" button to click. If people insist, also a
discreet link to the advanced search page.

On a slightly different tangent, I'm a little concerned about "cute"
names like "Starship" and "Vaults of Parnassus". One of our goals is
to appeal to dev managers. Dev managers don't like cute. Cute is the
antithesis of "this will help me deliver my product on time and keep my
job".
Post by Laura Creighton
While we are at it, I would prefer a larger window to type search
terms in, a button for search just the docs, and just the starship,
Fred L. Drake, Jr.
2003-09-23 19:33:43 UTC
Permalink
Post by Roy Smith
On a slightly different tangent, I'm a little concerned about "cute"
names like "Starship" and "Vaults of Parnassus". One of our goals is
But these aren't names "we" (meaning the python.org maintainers) have
created.
Post by Roy Smith
to appeal to dev managers. Dev managers don't like cute. Cute is the
antithesis of "this will help me deliver my product on time and keep my
job".
Understood.

Note that the Vaults will (likely) be replaced by the Python Package
Index (PyPI; www.python.org/pypi/) at some point in the future.
(Well, I think it should be, at any rate.) Just when this makes sense
depends on the growth of the PyPI database.


-Fred
--
Fred L. Drake, Jr. <fdrake at acm.org>
PythonLabs at Zope Corporation
Laura Creighton
2003-09-23 20:10:52 UTC
Permalink
Post by Fred L. Drake, Jr.
Post by Roy Smith
On a slightly different tangent, I'm a little concerned about "cute"
names like "Starship" and "Vaults of Parnassus". One of our goals is
But these aren't names "we" (meaning the python.org maintainers) have
created.
Post by Roy Smith
to appeal to dev managers. Dev managers don't like cute. Cute is the
antithesis of "this will help me deliver my product on time and keep my
job".
Understood.
Note that the Vaults will (likely) be replaced by the Python Package
Index (PyPI; www.python.org/pypi/) at some point in the future.
(Well, I think it should be, at any rate.) Just when this makes sense
depends on the growth of the PyPI database.
-Fred
If you want the PyPI database to grow, you need to put a pointer to the
index out there which tells people how to make such a package. I
still am unsure how, and there are lots of us in that boat. Then it
would be worthwhile to go through the vaults and make a package of
each thing there. I believe that there are lots of people in the
community who would love to help and would do this -- as long as it
was just plain work, and didn't require them to make judgements about
packages that they don't know anything about.

Laura
Jeremy Hylton
2003-09-25 01:09:06 UTC
Permalink
Post by Laura Creighton
If you want the PyPI database to grow, you need to put a pointer to the
index out there which tells people how to make such a package. I
still am unsure how, and there are lots of us in that boat.
I hope this helps:
http://www.python.org/~jeremy/weblog/030924.html

Jeremy

Roy Smith
2003-09-17 13:56:12 UTC
Permalink
Post by Tim Parkin
If we use google we won't get any more advanced features or sub searches
etc. It's a widely observed fact that only about 0.5% of people use
the advanced search anyway although for those that do it's a very
important feature.
This may sound harsh, but I think a feature which appeals to 0.5% of
our customers should have no bearing on our design. It's just not
worth worrying about. Not to mention that worrying about it distracts
you from worrying about the real issues which affect 90% of our
customers.
Stephan Deibel
2003-09-17 14:21:38 UTC
Permalink
Could we please stop cross-posting to both marketing-python and
pydotorg-redesign? This discussion should be on the latter, as it's
specifically about the design of the website.

For newer members: We've had to split off different discussions to keep
volume manageable for people interested in specific marketing-related
projects. There's a list of all the groups here:

http://pythonology.org/mailman/listinfo/marketing-python

Thanks,

- Stephan
Skip Montanaro
2003-09-17 16:05:44 UTC
Permalink
Roy> This may sound harsh, but I think a feature which appeals to 0.5%
Roy> of our customers should have no bearing on our design. It's just
Roy> not worth worrying about. Not to mention that worrying about it
Roy> distracts you from worrying about the real issues which affect 90%
Roy> of our customers.

Agreed. What about the other 9.5%? ;-)

Skip
Roy Smith
2003-09-17 14:06:46 UTC
Permalink
Post by Simon Willison
[1] Real nit-pickers may notice my blog uses "Search Site" rather than
just "Search". Rules are made to be broken ;)
You made an excellent point that the UI has to be simple and standard,
so I'm confused as to why you then break your own rule? Does "Search
Site" parse as <verb> <direct object> or as <adjective> <noun>?
I.e., does it mean "Click here to search this site", or "Click here to
go to the site at which I can do searches?"
Simon Willison
2003-09-17 14:16:09 UTC
Permalink
Post by Roy Smith
Post by Simon Willison
[1] Real nit-pickers may notice my blog uses "Search Site" rather than
just "Search". Rules are made to be broken ;)
You made an excellent point that the UI has to be simple and standard,
so I'm confused as to why you then break your own rule? Does "Search
Site" parse as <verb> <direct object> or as <adjective> <noun>?
I.e., does it mean "Click here to search this site", or "Click here to
go to the site at which I can do searches?"
The reason is about as shallow as you can get. Just the word "Search"
looked unbalanced compared to the other headers in that column, which
each have several words in them. I also figure that my site's audience
are smart enough to spot a search feature like that from 600 paces.

Cheers,

Simon
Tim Parkin
2003-09-17 14:34:02 UTC
Permalink
Post by Roy Smith
This may sound harsh, but I think a feature which appeals to 0.5% of
our customers should have no bearing on our design. It's just not
worth worrying about. Not to mention that worrying about it distracts
you from worrying about the real issues which affect 90% of our
customers.
It depends on how important that 0.5% is and how 'replaceable' the
functionality is.

If someone can't access the internet apart from via Lynx then we should
offer Lynx support even if it only amounts to 0.0005% of the browser
usage.

If there are alternatives that they could use then it's a 'preference'
that they use it and not a 'requirement'.

However, there is also a factor to take into account as to how important
people see the functionality. In the case of Lynx (and links), a lot of
users see it as very important that they can access a website using it.

Finally there is how easy it is to implement the change. If the website
is built to support disabled users, then the site will work by default
in Lynx, hence it isn't relevant to decide whether or not to support it.

With advanced searching, most of the work would have already been done
to make sure the simple search is suitably effective. E.g. a simple search
should recognise different domains (mail., www., or docs / tutorials) and
report the results accordingly. So if I search for 'documentation' I get a
link straight to the docs home page. The advanced search is just a more
explicit version of this.
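
As a rough sketch of what such a domain-aware simple search might do, the Python below groups result URLs by section and offers a direct link for a known keyword. Everything here is an assumption for illustration: the names `SHORTCUTS`, `section_of` and `simple_search`, the way sections are detected, and the sample URLs are not taken from any existing python.org search code.

```python
from collections import defaultdict
from urllib.parse import urlsplit

# Hypothetical shortcuts: query words that jump straight to a section home page.
SHORTCUTS = {"documentation": "http://www.python.org/doc/"}


def section_of(url):
    """Classify a result URL by host or leading path segment (assumed scheme)."""
    parts = urlsplit(url)
    if parts.hostname and parts.hostname.startswith("mail."):
        return "mailing lists"
    first_segment = parts.path.strip("/").split("/")[0]
    if first_segment == "doc":
        return "docs"
    return "www"


def simple_search(query, hits):
    """Return (shortcut, grouped): an optional direct link plus hits by section."""
    grouped = defaultdict(list)
    for url in hits:
        grouped[section_of(url)].append(url)
    return SHORTCUTS.get(query.strip().lower()), dict(grouped)


# Illustrative hits only; a real search backend would supply these.
hits = [
    "http://www.python.org/doc/current/tut/tut.html",
    "http://mail.python.org/pipermail/python-list/",
    "http://www.python.org/psf/",
]
shortcut, grouped = simple_search("Documentation", hits)
print(shortcut)
for section, urls in grouped.items():
    print(section, urls)
```

An advanced search page would then expose the same section names as explicit choices instead of inferring them.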

So we have low demand, high importance and ease of implementation.
The advanced search can be an option, so we don't have to worry about
confusing users; the two aren't mutually exclusive.

Tim