We count headword entries in a standard English dictionary. This means the standard word derivations are not counted (for example, "quickly," derived from "quick," does not count as a separate word). And while compound words are counted (like "air conditioning"), phrases and expressions are not (like "food for thought").
The test works in two steps. The first step contains 40 words which determine your rough level: do you have the vocabulary of a 3-year-old or a 20-year-old? Since we don't know who you are, we can't assume you'll check most of the boxes.
Then, with the approximate level determined, the second step shows around 120 words in four columns, which are selected in the general area around where we think your vocabulary level is. If we guessed perfectly, the first column should be almost entirely words you know, the last column should be almost entirely words you don't know, and the middle two columns should be a gradient between the two. If we didn't guess perfectly, we still have a good buffer to give you an accurate result.
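To make the second step concrete, here is a minimal sketch of how such a selection could work. It is not our actual implementation: the function name `second_step_columns`, the frequency-ranked list `ranked_words`, and the choice of a simple contiguous window around the estimated level are all assumptions made purely for illustration.

```python
def second_step_columns(ranked_words, estimated_rank, total=120, columns=4):
    """Pick about `total` words around `estimated_rank`, split into columns.

    `ranked_words` is a hypothetical list ordered from most to least common.
    A contiguous window is an assumption of this sketch; a real test could
    just as well sample words from a wider range.
    """
    half = total // 2
    # Clamp the window so it stays inside the list at both ends.
    start = max(0, min(estimated_rank - half, len(ranked_words) - total))
    window = ranked_words[start:start + total]
    per_column = len(window) // columns
    # First column holds the easiest words, last column the hardest.
    return [window[i * per_column:(i + 1) * per_column] for i in range(columns)]


words = [f"w{i}" for i in range(1000)]  # stand-in word list
cols = second_step_columns(words, estimated_rank=500)
```

With a good guess, most of the first column falls below your level and most of the last column above it; a wrong guess simply shifts the boundary into a different column, which is the buffer mentioned above.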
So some users will find they check more boxes overall, and others will leave more unchecked, so we leave the default to be unchecked. Of course, if you're a linguistic genius and know all the words, then there's nothing we can do, and you'll just have to exercise your finger more!
They jump from easy to hard because that's where the limit of your vocabulary is. It's not a feature of our test, it's a feature of vocabulary learning. And while it might seem to you that all the easy words are equally easy, we can assure you that other people would disagree.
There are many different factors that go into measuring someone's vocabulary size. For example, do you count "quick" and "quickly" as two words or just one? After all, "quickly" is just a simple and predictable derivation of "quick."
Depending on how different questions like these are decided, vocabulary tests can give widely different results. We take a conservative approach, and count the number of headwords (not derived words) which you are estimated to know in a standard dictionary. In the end, what really matters is not your absolute number, but rather your score relative to others who take the same test, no matter how the test is put together.
You're probably right. The percentiles listed so far are of the people who have taken the quiz, not of the population as a whole. And their average self-reported verbal SAT score, so far, is around 700 (out of a perfect 800 score). Compare that to the average US population score of around 500, and it's clear that our test-takers are far more literate than average.
As the number of participants increases, there should be more data to separate out percentiles based on different self-reported SAT scores, for example, and we'll be able to use this to generate comparison scores that are more representative of the population as a whole.
However, based on our own limited initial testing, we can give you a vague idea of foreign language acquisition for Brazilians enrolled in private English courses (generally meeting for around 3 hours a week):
Anything much beyond 10,000 words generally only comes from living abroad in an English-speaking country for a significant period of time, or else spending tremendous amounts of one's own time exposed to English media (books, sitcoms, movies, etc.).
You'll probably want the lemma.num file, for the first 6,000 or so words. If you're looking for more advanced English, you'll probably want the all.num.o5 file, which contains more than 200,000 entries (although many of them are redundant).
We try. On the last survey page (which asks about age, education, etc.), we specifically ask people not to fill it out if their answers have been less than truthful, and our results are based only on native English speakers who complete the survey.
But in any case, since our statistics are based on aggregate results, we expect that any level of "exaggeration" will affect all results, in a roughly equal or proportional way. And since it's not so much the absolute vocabulary numbers we're interested in, as much as the differences in results among groups, cheating shouldn't have too much of an overall effect. It might change the slopes or offsets of resulting graphs slightly, but shouldn't be expected to produce any qualitative differences.
Of course, your personal level of under- or over-confidence in word knowledge, compared to the public's as a whole, will affect your score comparison. But we're trying to make the survey easy, fun, and five minutes long, and therefore worth spreading around, instead of being a rigorous half-hour exam that guarantees no cheating. In the latter case, we might not have gotten any results at all.
PS: A lot of people have suggested that we include "fake" words to try to catch cheaters. We considered this, but we believe this creates more problems than it solves. And after all, nothing in real life depends on your score!
Because honestly, there really aren't any more generally-used words than that. The Oxford English Dictionary may list 300,000 words, but after 45,000, they're pretty much all either archaic, scientific/technical, or otherwise inapplicable to any kind of "general" vocabulary test. In fact, finding such general words beyond 35,000 was a real challenge.
Each person's vocabulary count has a margin of error of approximately ±10%. We also round results above 10,000 to the nearest 100, and results above 300 to the nearest 10.
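As a rough illustration of what that rounding and margin of error mean for a reported score, here is a short sketch. The function names `round_vocab_estimate` and `error_band` are hypothetical, and the tie-breaking rule (round half up) is an assumption of this sketch rather than something we specify.

```python
def round_vocab_estimate(estimate: float) -> int:
    """Round a raw estimate using the thresholds stated above."""
    if estimate > 10_000:
        step = 100   # results above 10,000: nearest 100
    elif estimate > 300:
        step = 10    # results above 300: nearest 10
    else:
        step = 1     # smaller results: nearest whole word
    # Round half up to the nearest multiple of `step` (assumed tie rule).
    return int((estimate + step / 2) // step) * step


def error_band(estimate: float, margin: float = 0.10) -> tuple[float, float]:
    """Return the (low, high) range implied by a relative margin of error."""
    return estimate * (1 - margin), estimate * (1 + margin)


print(round_vocab_estimate(23_456))  # 23500
print(round_vocab_estimate(4_567))   # 4570
low, high = error_band(20_000)       # roughly 18,000 to 22,000
```

In other words, a reported score of 20,000 should be read as "somewhere around 18,000 to 22,000," which is also why finer rounding at that level would not mean much.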
We will produce results for vocabulary correlations according to age, education level, and SAT score for native speakers of English, possibly divided by country. However, these results will be based solely on people who have chosen to take the test, which is not a controlled, representative sample of the general population.
We have chosen not to ask about race, income level, religion, or a variety of other factors that could produce a more scientific survey, in the interests of keeping the survey popular and quick to take. However, controlling for age, education and gender already goes a long way towards producing reliable results, and we plan on investigating to what extent we can control for geographic location and income level in the US by using reported ZIP codes together with US census data. In summary: the survey isn't perfect, but should nevertheless still provide quite meaningful data. And it should certainly be able to indicate what areas are worth pursuing (in a subsequent rigorous, controlled scientific survey, for example).
Astrological signs don't coincide exactly with the months, so it's nothing as interesting as that! It's just so we can calculate the ages of younger children to a level more precise than just the year they were born.
Yes. However, the word list we use stays relatively the same, each time you take the test. So if you look up any words you're being tested on, during or after the test, then future test results will be artificially high, and will no longer be accurate.
So if you'd like to use this test to track your vocabulary growth, we recommend only taking it once per year at the most, and only if you're very careful never to look up or ask about words you saw on the test.
Also, if you take the test multiple times, please only fill out the research survey portion the first time.
We'd like to eventually, but there's an awful lot of work involved in creating a word list that is scientifically accurate, because it has to be built on top of solid data. Specifically, for any language, we need to find: