steven (Offline)
JF Old Timer
 
Posts: 544
Join Date: Apr 2010
Using Google as a Language Tool - 10-19-2010, 02:33 AM

Before I get started, I want to clarify that I'm not talking about "Google Translate" or anything like that.

What I am talking about is using Google to look up a phrase, which should be input in quotes ("") for accuracy, to see how many results (or hits) you get. You can then compare that number to the number of results for a similar phrase.
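For anyone who wants to build these quoted searches without typing them by hand, here is a minimal sketch. The function name `exact_phrase_url` is just something I made up for illustration; it only builds the search link and does not fetch or count anything.

```python
from urllib.parse import urlencode


def exact_phrase_url(phrase: str) -> str:
    """Build a Google search link for an exact (quoted) phrase.

    Hypothetical helper for illustration: it only formats the URL.
    """
    # Wrapping the phrase in double quotes asks for an exact-phrase match.
    quoted = f'"{phrase}"'
    # urlencode escapes the quotes (%22) and turns spaces into '+'.
    return "https://www.google.com/search?" + urlencode({"q": quoted})


print(exact_phrase_url("ride the bus to school"))
```

You would still read the hit count off the results page yourself.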

One reason for doing this is to figure out which phrases are used more than others. Another reason is to check the accuracy of grammar. For example, you can input a phrase you are not sure about into Google to check whether you get any results. This can be useful for checking particle use or word choice, specifically when talking about Japanese.

However, this method certainly does not explain why one phrase is preferred over another. It also doesn't really reveal any nuances (which can be a particular issue with particles). Sometimes you also end up with results that can go either way, statistically speaking. Also, the quality of the results can easily be overlooked. Bad quality "hits" will include inquiries about the grammar in question, such as "is ____ correct?". Other bad hits include those from foreign websites (or domestic websites with foreign writers). Another thing is, I believe that the internet portrays mostly "written language". However, I think people often write as they speak on the internet as well, which makes it kind of a "gray area". That's not to mention "internet-only" language.
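To make the "can go either way" problem concrete, here is a rough sketch of how I would compare two hit counts that have been read off the results page by hand. The function `compare_hits` and the `min_ratio` cutoff of 3 are my own assumptions, not any kind of standard; the point is only that a small difference in counts shouldn't be treated as an answer.

```python
def compare_hits(count_a: int, count_b: int, min_ratio: float = 3.0) -> str:
    """Return 'a', 'b', or 'unclear' for two manually noted hit counts.

    min_ratio is an arbitrary, assumed cutoff: the larger count must be at
    least this many times the smaller one before we call a winner.
    """
    if count_a < 0 or count_b < 0:
        raise ValueError("hit counts cannot be negative")
    if count_a == 0 and count_b == 0:
        return "unclear"  # no evidence for either phrase
    larger, smaller = max(count_a, count_b), min(count_a, count_b)
    # A zero count on one side, or a big enough ratio, counts as a clear gap.
    if smaller == 0 or larger / smaller >= min_ratio:
        return "a" if count_a > count_b else "b"
    return "unclear"


# Made-up counts purely for illustration, not real search data.
print(compare_hits(900, 100))  # clear gap in favor of phrase a
print(compare_hits(500, 400))  # too close to call
```

Even a "clear" result says nothing about why, or about the quality of the hits behind the number.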

To be honest, I find myself using it as a tool to teach English to other people more than I use it to teach myself Japanese. In doing so I have noticed a lot of the "bad sides" of using Google as a language tool. However, I think if used correctly it is a great tool.

I'm sure I'm not the only one who does this. For those of you out there who also do this, I want to ask you a question:

Have you noticed that the "number of results" can sometimes plummet as the page number rises? For example, Google might initially tell you that you have over a million results for a phrase, but once you get to page 40-60 you find out that there are only around 400 or so. Then it has the "show repeat results" function. I think that some of those can be considered "bad results" as they are often repeats (like quoted forum posts and stuff like that). However, even with "repeat results" you are still stuck with maybe 500 or so. What's more, if you look at the top right of the page, it still says something like "showing 500 of 1,000,000 results".

I have thought of some possible reasons for this. I want to share some of them to hear what you guys think about them.

1. One reason for this could be that results might "vary" depending on the country in which the searcher resides.

2. I also thought that maybe Google only shows the first few hundred results for certain queries (which doesn't really make sense, as I've gone through many pages of some results to test this idea).

3. One more reason that I thought of is that maybe Google wants to make it look like they have more results than they actually have. I can't really think of a reason for this though... maybe it looks good for investors??

4. Another reason could be that maybe Google simply doesn't want to show all their results. Maybe they only give that kind of "data" to certain groups (or maybe "give" could be replaced with "sell").

These are the biggest reasons I can think of. If anyone can think of any other reasons, I'd like to hear them (especially since I tend to overlook the most obvious things).

Since I can't really test #1 on my own, I'd like to ask for some help.

Here is an example: "ride the bus to school"
"ride the bus to school" - Google Search

That link goes to the last page I could reach. At the top of the page it says: Results 541-543 of about 624,000 for "ride the bus to school".
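Just to show how big the gap is, here is the arithmetic as a small sketch using the two numbers from that page (543 results reachable, about 624,000 estimated). The function name `reachable_share` is only for illustration.

```python
def reachable_share(last_result_shown: int, estimated_total: int) -> float:
    """Fraction of the estimated total that could actually be paged through."""
    if estimated_total <= 0:
        raise ValueError("estimated total must be positive")
    return last_result_shown / estimated_total


# Numbers taken from the results page quoted in this post.
share = reachable_share(543, 624_000)
print(f"{share:.4%}")  # prints 0.0870%
```

In other words, I could page through less than a tenth of one percent of the estimate.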

I'd like to know if people in other countries get further than I can (I live in Japan).

I think I could post a lot more about using Google as a language tool, but I'll stop short for now in hopes of getting some feedback. I'd appreciate any feedback on this, especially if you know some kind of work-around to this issue.

Thanks,
Steven