Turkish and Kyrgyz/Making a corpus from azattyk
Revision as of 08:44, 3 October 2011 by Firespeaker (talk | contribs)
- Get the archive listing for a section and day from
- http://www.azattyk.org/archive/[1]/[date]/[2]/[2].html
- [1] and [2] identify the section (see table below)
- [date] is in the format yyyymmdd
- Find each «li class="date archive_listrow_date"»[3]«/li»; the date is [3]
- Every «li»[4]«/li» between one «li class="date archive_listrow_date"»«/li» and the next «li class="date archive_listrow_date"»«/li» is an article
- Each «li»[4]«/li» contains an «a href="[5]"»[6]«/a»; [5] is the relative URL of the article and [6] is its title
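The parsing steps above could be sketched roughly as follows, using only Python's standard `html.parser`. This is a minimal sketch, not a tested scraper for the live site: the class and function names (`ArchiveListParser`, `parse_archive_listing`) are made up for illustration, the class name is assumed to be spelled `archive_listrow_date`, and the page is assumed to contain no nested «li» elements inside article rows.

```python
from html.parser import HTMLParser

# Assumed spelling of the class that marks a date row in the listing.
DATE_CLASS = "archive_listrow_date"


class ArchiveListParser(HTMLParser):
    """Collect (date, relative_url, title) triples from a listing page."""

    def __init__(self):
        super().__init__()
        self.articles = []     # collected (date, href, title) triples
        self._date = None      # text of the most recent date row ([3])
        self._in_date = False  # currently inside a date <li>
        self._in_item = False  # currently inside an article <li> ([4])
        self._href = None      # href of the currently open <a> ([5])
        self._text = []        # text pieces of the element being read

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "li":
            classes = (attrs.get("class") or "").split()
            if DATE_CLASS in classes:
                self._in_date = True
                self._text = []
            else:
                # Rows only count as articles once a date row has been seen.
                self._in_item = self._date is not None
        elif tag == "a" and self._in_item and attrs.get("href"):
            self._href = attrs["href"]
            self._text = []

    def handle_data(self, data):
        if self._in_date or self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "li":
            if self._in_date:
                self._date = "".join(self._text).strip()
                self._in_date = False
            else:
                self._in_item = False
        elif tag == "a" and self._href is not None:
            title = "".join(self._text).strip()  # link text ([6])
            self.articles.append((self._date, self._href, title))
            self._href = None


def parse_archive_listing(html):
    """Return a list of (date, relative_url, title) for one listing page."""
    parser = ArchiveListParser()
    parser.feed(html)
    parser.close()
    return parser.articles
```

The date is returned exactly as it appears in the page; converting it to a normalised form is left out because the on-page date format is not stated here.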
[1] | [2] |
---|---|
ky-kyrgyzstan | 392 |
ky-central_asia | 393 |
ky-world | 394 |
ky-politics | 395 |
ky-human_rights | 396 |
ky-economy | 397 |
ky-culture | 398 |
ky-voice_of_people | 399 |
ky-sport | 400 |
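
As an illustration of how the table plugs into the URL pattern, here is a minimal Python sketch that builds the listing URL for one section and day. The names `SECTIONS` and `archive_url` are hypothetical helpers introduced for this example; the URL pattern and the values are taken from the notes above.

```python
from datetime import date

# Section name ([1]) mapped to section number ([2]), copied from the table.
SECTIONS = {
    "ky-kyrgyzstan": 392,
    "ky-central_asia": 393,
    "ky-world": 394,
    "ky-politics": 395,
    "ky-human_rights": 396,
    "ky-economy": 397,
    "ky-culture": 398,
    "ky-voice_of_people": 399,
    "ky-sport": 400,
}


def archive_url(section, day):
    """Build the archive listing URL for a section name and a date object."""
    section_id = SECTIONS[section]  # raises KeyError for unknown sections
    # [date] is formatted as yyyymmdd; [2] appears twice in the path.
    return (
        f"http://www.azattyk.org/archive/{section}/{day:%Y%m%d}"
        f"/{section_id}/{section_id}.html"
    )


if __name__ == "__main__":
    # Print the listing URL of every section for a single day.
    for name in SECTIONS:
        print(archive_url(name, date(2011, 10, 3)))
```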