SoLoR Posted April 3, 2010 (edited)

OK, I'm trying to make some parsing magic. I have strings like this:

hfList = [{fixid:'410273',product:'.NET Framework 3.0 - Win7, Windows Server 2008 R2 \x28CBS\x29',language:'All \x28Global\x29',langcode:'intl',platform:'x86',release:'sp2',filename:'DevDiv879341',version:'NDP 3.0',build:'30729.5020',size:'4382033',credate:'3\x2f31\x2f2010 10\x3a41\x3a36 PM',moddate:'3\x2f31\x2f2010 10\x3a41\x3a36 PM'},{fixid:'410225',product:'.NET Framework 3.0 - Windows Server 2003, Windows XP \x28MSI\x29',language:'All \x28Global\x29',langcode:'intl',platform:'x86',release:'sp2',filename:'DevDiv866275',version:'NDP 3.0',build:'30729.4519',size:'8630148',credate:'3\x2f29\x2f2010 10\x3a55\x3a52 PM',moddate:'3\x2f29\x2f2010 10\x3a55\x3a52 PM'},{fixid:'410274',product:'.NET Framework 3.0 - Win7, Windows Server 2008 R2 \x28CBS\x29',language:'All \x28Global\x29',langcode:'intl',platform:'x64',release:'sp2',filename:'DevDiv879341',version:'NDP 3.0',build:'30729.5020',size:'6050502',credate:'3\x2f31\x2f2010 10\x3a42\x3a38 PM',moddate:'3\x2f31\x2f2010 10\x3a42\x3a38 PM'},{fixid:'410226',product:'.NET Framework 3.0 - Windows Server 2003, Windows XP \x28MSI\x29',language:'All \x28Global\x29',langcode:'intl',platform:'x64',release:'sp2',filename:'DevDiv866275',version:'NDP 3.0',build:'30729.4519',size:'14043927',credate:'3\x2f29\x2f2010 10\x3a57\x3a31 PM',moddate:'3\x2f29\x2f2010 10\x3a57\x3a31 PM'}];

It's obvious what I want here: some kind of parsing algorithm (in Perl I guess, so I can use it on my Linux box) that would extract the data I want, like product, platform, date, etc. This is for parsing the MS hotfix database. I currently have this as a shell script:

    #!/bin/bash
    i=981107
    while [ $i -le 981110 ]
    do
        # Quote the URL so the shell does not treat "&" as "run in background".
        wget --cookies=on --load-cookies=cookie.txt -O "KBHotfix_${i}.html" "http://support.microsoft.com/hotfix/KBHotfix.aspx?kbnum=${i}&kbln=en-us"
        grep hfList "KBHotfix_${i}.html" > "${i}"
        rm "KBHotfix_${i}.html"
        # Drop the result file if the page had no hfList line.
        if [ ! -s "${i}" ]; then
            rm "${i}"
        fi
        (( i++ ))
    done

This gives me files (named by KB number) containing the string above. I just need to parse them properly to build a database/table from all the data. So, any ideas? Eventually I will try to make it send me a notice about which updates are new, and of course put it in crontab to go through the MS hotfix database once or twice a week. But the first thing is the parsing; if I have to rely on Google and trial and error, it will take a long time.

Edited April 3, 2010 by SoLoR
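One way to approach the parsing question, sketched in Python rather than the Perl mentioned above: treat each `{...}` group as a record, pull out the `key:'value'` pairs with a regular expression, and decode the `\xNN` escapes. The function names `parse_hflist` and `decode_js_escapes` are made up for this illustration, the sample is a shortened copy of the first record from the post, and the sketch assumes values never contain braces or unescaped single quotes.

```python
import re

# Matches key:'value' pairs; values may contain backslash escapes such as \x28.
PAIR_RE = re.compile(r"(\w+):'((?:[^'\\]|\\.)*)'")
# Matches each {...} record; assumes no braces inside values (true for the sample).
RECORD_RE = re.compile(r"\{([^{}]*)\}")


def decode_js_escapes(text):
    """Replace \\xNN escapes with the characters they stand for (\\x28 becomes '(')."""
    return re.sub(r"\\x([0-9A-Fa-f]{2})", lambda m: chr(int(m.group(1), 16)), text)


def parse_hflist(line):
    """Return one dict per {...} record found in an 'hfList = [...]' line."""
    records = []
    for body in RECORD_RE.findall(line):
        records.append({key: decode_js_escapes(value)
                        for key, value in PAIR_RE.findall(body)})
    return records


# Shortened copy of the first record from the post (some fields left out).
sample = (r"hfList = [{fixid:'410273',product:'.NET Framework 3.0 - Win7, "
          r"Windows Server 2008 R2 \x28CBS\x29',platform:'x86',"
          r"credate:'3\x2f31\x2f2010 10\x3a41\x3a36 PM'}];")

for rec in parse_hflist(sample):
    # One tab-separated row per hotfix record.
    print(rec["fixid"], rec["platform"], rec["product"], rec["credate"], sep="\t")
```

Run over the per-KB files the shell script leaves behind, this would give one dict per hotfix with the escapes already turned back into `(`, `)`, `/` and `:`.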
ricktendo Posted April 3, 2010

Legolash2o (creator of Win7toolkit) is great at this stuff. I recall he made one that went to that exact URL and checked whether the hotfix was for Vista SP1. I have also requested a tool that would do exactly that, and also differentiate whether the hotfix is for Windows 7, Vista, XP, SP?, etc.
SoLoR (Author) Posted April 3, 2010

Yes, of course. The goal is to create one big database/table with everything, so you can then just filter for what you want or need. For a start I just want a simple tab-separated table, so I can import it into Excel and sort by date. For example, the goal is something like this: http://www16.atpages.jp/~hotfix/ but first of all with all fixes, preferably as a MySQL database with custom filters... Oh well, somehow I feel that with my current coding knowledge I'll be doing this **** for a few months, since I need to learn basically everything from scratch.
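A rough Python sketch of the tab-separated table described above, assuming the records are already available as dicts with decoded dates. The two sample records reuse values from the first post; the column choice in `FIELDS` and the helper names are only assumptions for illustration.

```python
import csv
import io
from datetime import datetime

# Columns to export; any other keys in a record are ignored.
FIELDS = ["fixid", "product", "platform", "credate"]


def credate_key(record):
    # Dates look like "3/31/2010 10:41:36 PM" once \x2f and \x3a are decoded.
    return datetime.strptime(record["credate"], "%m/%d/%Y %I:%M:%S %p")


def write_tsv(records, out):
    """Write the records to 'out' as tab-separated rows, oldest first."""
    writer = csv.DictWriter(out, fieldnames=FIELDS, delimiter="\t",
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    for record in sorted(records, key=credate_key):
        writer.writerow(record)


records = [
    {"fixid": "410273",
     "product": ".NET Framework 3.0 - Win7, Windows Server 2008 R2 (CBS)",
     "platform": "x86", "credate": "3/31/2010 10:41:36 PM"},
    {"fixid": "410225",
     "product": ".NET Framework 3.0 - Windows Server 2003, Windows XP (MSI)",
     "platform": "x86", "credate": "3/29/2010 10:55:52 PM"},
]

buffer = io.StringIO()
write_tsv(records, buffer)
print(buffer.getvalue(), end="")
```

Passing an open file instead of the `io.StringIO` buffer would produce a file that Excel can import as tab-delimited text.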
Legolash2o Posted April 4, 2010 (edited)

lol, that's what I created Windows Updater for: every time an update was released I would add it to the SQL database, and whoever did not have it installed would get shown the new updates. If I can help, let me know though. I use VB .NET.

@Ricktendo64 I remember that program now; it took forever to go through all the KB articles, though.

Edited April 4, 2010 by Legolash2o
SoLoR (Author) Posted April 4, 2010 (edited)

Why I'm doing this: KB hotfixes are posted sooner than the KB articles are created. For now I found an "easy way out": putting the parsed articles that mention "Win7" or "Windows 7" into one directory and then diffing it against the "old" one. This does not give me a database, but I do know what is new and what is not.

I'm currently in the middle of parsing hotfixes from 970000 to 983000 to see how long it will take. I know this does not cover everything, but it will include 99.9% of the new ones, and if I miss one every full moon then so be it. It runs at approx. 1 article/sec, so it's a bit under 4h for the whole database. What I'll do is crontab it twice a week and have the diffs sent to my mail...

The thing is, for the "big" project I had in mind we would need a full 24/7 server and everything. Doing something this size for someone running it on his home computer seems kind of pointless.

But yes, the general idea is: you download the hotfix download page (you need to provide a cookie showing that you agreed to the license agreement), then you search for the hfList variable inside. If it exists, the article has public hotfixes, and everything needed is in that hfList: OS, dates, architecture, etc.

Edited April 4, 2010 by SoLoR
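The workflow described above (look for hfList in the page, keep only Windows 7 entries, compare against the previous run) could be sketched in Python roughly as follows. The page bodies, the `previous` list and all function names are hypothetical stand-ins; real page text would come from the download step. As a sanity check on the timing, 13,000 articles at about one per second is roughly 3.6 hours, which matches "a bit under 4h".

```python
def extract_hflist_line(page_text):
    """Return the first line containing 'hfList', or None if there is none."""
    for line in page_text.splitlines():
        if "hfList" in line:
            return line.strip()
    return None


def mentions_windows7(hflist_line):
    """True if the line names Windows 7 in either spelling used in the thread."""
    return "Win7" in hflist_line or "Windows 7" in hflist_line


def newly_seen(old_kbs, current_kbs):
    """KB numbers present in the current run but not in the previous one."""
    return sorted(set(current_kbs) - set(old_kbs))


# Hypothetical page bodies keyed by KB number (made up for this sketch).
pages = {
    981107: "<html>\nhfList = [{fixid:'1',product:'Windows 7'}];\n</html>",
    981108: "<html>\nno public hotfix on this page\n</html>",
    981112: "<html>\nhfList = [{fixid:'2',product:'Windows XP'}];\n</html>",
}

current = []
for kb, text in pages.items():
    line = extract_hflist_line(text)
    # Keep only pages that have an hfList and mention Windows 7.
    if line is not None and mentions_windows7(line):
        current.append(kb)

previous = [980951]  # hypothetical result of an earlier run
print(newly_seen(previous, current))
```

Comparing sets of KB numbers gives the same "what is new" answer as diffing two directories, without needing to keep the old files around.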
SoLoR (Author) Posted April 4, 2010 (edited)

04.04.2010 13:55 16.128 970413
04.04.2010 13:55 1.625 970486
04.04.2010 13:55 2.930 970924
04.04.2010 13:55 1.480 971601
04.04.2010 13:55 2.918 971988
04.04.2010 13:55 2.137 972251
04.04.2010 13:55 1.414 972551
04.04.2010 13:55 2.225 972583
04.04.2010 13:55 591 972831
04.04.2010 13:55 2.840 972848
04.04.2010 13:55 855 973062
04.04.2010 13:55 1.669 973529
04.04.2010 13:55 2.914 973746
04.04.2010 13:55 882 973751
04.04.2010 13:55 1.996 973975
04.04.2010 13:55 593 974039
04.04.2010 13:55 3.014 974065
04.04.2010 13:55 567 974090
04.04.2010 13:55 3.012 974168
04.04.2010 13:55 16.529 974176
04.04.2010 13:55 590 974204
04.04.2010 13:55 587 974259
04.04.2010 13:55 1.129 974324
04.04.2010 13:55 3.006 974372
04.04.2010 13:55 871 974476
04.04.2010 13:55 1.986 974477
04.04.2010 13:55 876 974560
04.04.2010 13:55 589 974598
04.04.2010 13:55 879 974624
04.04.2010 13:55 304 974636
04.04.2010 13:55 594 974638
04.04.2010 13:55 873 974639
04.04.2010 13:55 303 974672
04.04.2010 13:55 910 974674
04.04.2010 13:55 302 974719
04.04.2010 13:55 304 974909
04.04.2010 13:55 587 974912
04.04.2010 13:55 592 974930
04.04.2010 13:55 589 974940
04.04.2010 13:55 589 974944
04.04.2010 13:55 589 975053
04.04.2010 13:55 2.900 975076
04.04.2010 13:55 836 975142
04.04.2010 13:55 882 975214
04.04.2010 13:55 867 975243
04.04.2010 13:55 1.672 975332
04.04.2010 13:55 306 975351
04.04.2010 13:55 1.161 975354
04.04.2010 13:55 301 975360
04.04.2010 13:55 1.690 975363
04.04.2010 13:55 1.680 975415
04.04.2010 13:55 591 975469
04.04.2010 13:55 879 975496
04.04.2010 13:55 589 975499
04.04.2010 13:55 885 975500
04.04.2010 13:55 1.666 975512
04.04.2010 13:55 304 975530
04.04.2010 13:55 589 975535
04.04.2010 13:55 593 975538
04.04.2010 13:55 873 975599
04.04.2010 13:55 879 975617
04.04.2010 13:55 1.732 975619
04.04.2010 13:55 872 975620
04.04.2010 13:55 1.679 975680
04.04.2010 13:55 873 975688
04.04.2010 13:55 879 975741
04.04.2010 13:55 587 975762
04.04.2010 13:55 305 975763
04.04.2010 13:55 879 975777
04.04.2010 13:55 879 975778
04.04.2010 13:55 591 975806
04.04.2010 13:55 595 975851
04.04.2010 13:55 873 975858
04.04.2010 13:55 873 975921
04.04.2010 13:55 2.942 975954
04.04.2010 13:55 881 975992
04.04.2010 13:55 1.137 976038
04.04.2010 13:55 873 976090
04.04.2010 13:55 305 976093
04.04.2010 13:55 306 976096
04.04.2010 13:55 879 976099
04.04.2010 13:55 1.990 976117
04.04.2010 13:55 879 976187
04.04.2010 13:55 587 976210
04.04.2010 13:55 1.690 976240
04.04.2010 13:55 879 976296
04.04.2010 13:55 1.393 976373
04.04.2010 13:55 879 976398
04.04.2010 13:55 879 976399
04.04.2010 13:55 597 976417
04.04.2010 13:55 873 976418
04.04.2010 13:55 882 976419
04.04.2010 13:55 879 976422
04.04.2010 13:55 1.181 976425
04.04.2010 13:55 879 976427
04.04.2010 13:55 1.690 976438
04.04.2010 13:55 882 976443
04.04.2010 13:55 996 976462
04.04.2010 13:55 595 976483
04.04.2010 13:55 306 976484
04.04.2010 13:55 888 976494
04.04.2010 13:55 881 976525
04.04.2010 13:55 875 976527
04.04.2010 13:55 595 976528
04.04.2010 13:55 888 976571
04.04.2010 13:55 874 976586
04.04.2010 13:55 595 976627
04.04.2010 13:55 304 976655
04.04.2010 13:55 881 976658
04.04.2010 13:55 879 976746
04.04.2010 13:55 885 976755
04.04.2010 13:55 1.678 976759
04.04.2010 13:55 593 976781
04.04.2010 13:55 591 976782
04.04.2010 13:55 879 976883
04.04.2010 13:55 882 976887
04.04.2010 13:55 3.025 976898
04.04.2010 13:55 880 976910
04.04.2010 13:55 593 976972
04.04.2010 13:55 879 977015
04.04.2010 13:55 999 977020
04.04.2010 13:55 595 977036
04.04.2010 13:55 881 977067
04.04.2010 13:55 304 977068
04.04.2010 13:55 2.664 977069
04.04.2010 13:55 597 977071
04.04.2010 13:55 873 977096
04.04.2010 13:55 885 977132
04.04.2010 13:55 881 977158
04.04.2010 13:55 1.735 977178
04.04.2010 13:55 880 977180
04.04.2010 13:55 1.998 977182
04.04.2010 13:55 303 977184
04.04.2010 13:55 879 977186
04.04.2010 13:55 873 977222
04.04.2010 13:55 885 977307
04.04.2010 13:55 599 977314
04.04.2010 13:55 879 977342
04.04.2010 13:55 877 977346
04.04.2010 13:55 879 977348
04.04.2010 13:55 873 977353
04.04.2010 13:55 880 977357
04.04.2010 13:55 888 977375
04.04.2010 13:55 876 977380
04.04.2010 13:55 9.619 977381
04.04.2010 13:55 886 977392
04.04.2010 13:55 595 977397
04.04.2010 13:55 1.200 977413
04.04.2010 13:55 593 977415
04.04.2010 13:55 590 977417
04.04.2010 13:55 882 977419
04.04.2010 13:55 1.002 977420
04.04.2010 13:55 885 977542
04.04.2010 13:55 306 977570
04.04.2010 13:55 879 977579
04.04.2010 13:55 886 977608
04.04.2010 13:55 591 977609
04.04.2010 13:55 12.556 977615
04.04.2010 13:55 595 977617
04.04.2010 13:55 591 977620
04.04.2010 13:55 887 977627
04.04.2010 13:55 589 977632
04.04.2010 13:55 1.998 977643
04.04.2010 13:55 885 977648
04.04.2010 13:55 304 977692
04.04.2010 13:55 877 977695
04.04.2010 13:55 16.568 977748
04.04.2010 13:55 873 977787
04.04.2010 13:55 595 977832
04.04.2010 13:55 1.008 977866
04.04.2010 13:55 568 977894
04.04.2010 13:55 591 977911
04.04.2010 13:55 883 977944
04.04.2010 13:55 886 977977
04.04.2010 13:55 867 978000
04.04.2010 13:55 594 978034
04.04.2010 13:55 1.676 978042
04.04.2010 13:55 874 978048
04.04.2010 13:55 304 978055
04.04.2010 13:55 874 978116
04.04.2010 13:55 589 978118
04.04.2010 13:55 16.421 978125
04.04.2010 13:55 10.492 978155
04.04.2010 13:55 1.672 978157
04.04.2010 13:55 589 978205
04.04.2010 13:55 587 978206
04.04.2010 13:55 3.032 978249
04.04.2010 13:55 3.032 978254
04.04.2010 13:55 587 978258
04.04.2010 13:55 591 978265
04.04.2010 13:55 874 978277
04.04.2010 13:55 873 978330
04.04.2010 13:55 885 978347
04.04.2010 13:55 591 978387
04.04.2010 13:55 883 978433
04.04.2010 13:55 1.991 978454
04.04.2010 13:55 879 978462
04.04.2010 13:55 592 978476
04.04.2010 13:55 1.678 978491
04.04.2010 13:55 867 978500
04.04.2010 13:55 874 978516
04.04.2010 13:55 2.328 978520
04.04.2010 13:55 873 978526
04.04.2010 13:55 588 978527
04.04.2010 13:55 595 978529
04.04.2010 13:55 879 978535
04.04.2010 13:55 592 978562
04.04.2010 13:55 873 978571
04.04.2010 13:55 589 978632
04.04.2010 13:55 873 978714
04.04.2010 13:55 873 978738
04.04.2010 13:55 867 978837
04.04.2010 13:55 879 978838
04.04.2010 13:55 881 978869
04.04.2010 13:55 873 978884
04.04.2010 13:55 870 978917
04.04.2010 13:55 873 978918
04.04.2010 13:55 588 978943
04.04.2010 13:55 874 978977
04.04.2010 13:55 996 978981
04.04.2010 13:55 873 978982
04.04.2010 13:55 873 979039
04.04.2010 13:55 1.686 979101
04.04.2010 13:55 879 979140
04.04.2010 13:55 875 979155
04.04.2010 13:55 303 979214
04.04.2010 13:55 867 979223
04.04.2010 13:55 589 979224
04.04.2010 13:55 1.670 979241
04.04.2010 13:55 587 979294
04.04.2010 13:55 880 979350
04.04.2010 13:55 591 979366
04.04.2010 13:55 587 979373
04.04.2010 13:55 588 979374
04.04.2010 13:55 879 979383
04.04.2010 13:55 876 979425
04.04.2010 13:55 876 979443
04.04.2010 13:55 882 979444
04.04.2010 13:55 873 979470
04.04.2010 13:55 879 979491
04.04.2010 13:55 873 979495
04.04.2010 13:55 303 979524
04.04.2010 13:55 873 979530
04.04.2010 13:55 879 979532
04.04.2010 13:55 2.002 979533
04.04.2010 13:55 867 979538
04.04.2010 13:55 873 979543
04.04.2010 13:55 1.127 979548
04.04.2010 13:55 2.007 979562
04.04.2010 13:55 302 979564
04.04.2010 13:55 587 979567
04.04.2010 13:55 876 979580
04.04.2010 13:55 873 979619
04.04.2010 13:55 873 979643
04.04.2010 13:55 588 979681
04.04.2010 13:55 587 979711
04.04.2010 13:55 3.028 979744
04.04.2010 13:55 587 979745
04.04.2010 13:55 878 979751
04.04.2010 13:55 879 979791
04.04.2010 13:55 882 979903
04.04.2010 13:55 2.001 979917
04.04.2010 13:55 3.097 980077
04.04.2010 13:55 871 980120
04.04.2010 13:55 2.007 980138
04.04.2010 13:55 1.684 980226
04.04.2010 13:55 2.010 980251
04.04.2010 13:55 583 980295
04.04.2010 13:55 1.182 980333
04.04.2010 13:55 303 980358
04.04.2010 13:55 1.668 980368
04.04.2010 13:55 876 980396
04.04.2010 13:55 1.681 980423
04.04.2010 13:55 303 980598
04.04.2010 13:55 875 980681
04.04.2010 13:55 302 980856
04.04.2010 13:55 869 980922
04.04.2010 13:55 2.656 980951
04.04.2010 13:55 587 980959
04.04.2010 13:55 996 981002
04.04.2010 13:55 1.336 981107
04.04.2010 13:55 583 981112
04.04.2010 13:55 999 981119
04.04.2010 13:55 16.297 981128
04.04.2010 13:55 585 981129
04.04.2010 13:55 585 981130
04.04.2010 13:55 1.394 981194
04.04.2010 13:55 583 981214
04.04.2010 13:55 1.992 981286
04.04.2010 13:55 1.994 981303
04.04.2010 13:55 304 981618
04.04.2010 13:55 587 981813
04.04.2010 13:55 881 981877
293 File(s)

Here is everything available "on request" in the MS database from 970000 to 983000 regarding Windows 7. Before you ask: yes, some updates are duplicated and things like that...

Edited April 4, 2010 by SoLoR