Difference between revisions of "Turkish and Kyrgyz/Making a corpus from azattyk"

From Apertium
(Created page with '# Get from ** http://www.azattyk.org/archive/[1]/20100101/[2]/[2].html **{|class="wikitable" |+ possible values for [1] and [2] |- ! [1] ! [2] |- | ky-kyrgyzstan || 392 |- | ky-c…')
 
Line 1: Line 1:
 
# Get from
 
# Get from
** http://www.azattyk.org/archive/[1]/20100101/[2]/[2].html
+
#* http://www.azattyk.org/archive/[1]/[date]/[2]/[2].html
  +
#* [1] and [2] (see table below)
**{|class="wikitable"
 
  +
#* [date] is in format yyyymmdd
 
# find each «li class="date archive_listrow_date"»[3]«/li>>; date is [3]
 
# all «li»[4]«/li» between above «li class="date arhive_listrow_date"»«/li» and next «li class="date arhive_listrow_date"»«/li» is an article
 
# each «li»[4]«/li» contains an «a href="[5]"»[6]«/a»; [5] is relative url of article, [6] is title of article
  +
  +
 
{|class="wikitable"
 
|+ possible values for [1] and [2]
 
|+ possible values for [1] and [2]
 
|-
 
|-
Line 25: Line 32:
 
| ky-sport || 400
 
| ky-sport || 400
 
|}
 
|}
# find each <li class="date archive_listrow_date">[3]</li>; date is [3]
 
# all <li>[4]</li> between above <li class="date arhive_listrow_date"></li> and next <li class="date arhive_listrow_date"></li> is an article
 
# each <li>[4]</li> contains an <a href="[5]">[6]</a>; [5] is relative url of article, [6] is title of article
 

Revision as of 08:44, 3 October 2011

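  1. Get the archive page from
     * http://www.azattyk.org/archive/[1]/[date]/[2]/[2].html
     * [1] and [2] (see table below)
     * [date] is in format yyyymmdd
  2. Find each «li class="date archive_listrow_date"»[3]«/li»; the date is [3]. (« and » stand for the HTML angle brackets.)
  3. Every «li»[4]«/li» between one «li class="date archive_listrow_date"»«/li» and the next «li class="date archive_listrow_date"»«/li» is an article.
  4. Each «li»[4]«/li» contains an «a href="[5]"»[6]«/a»; [5] is the relative URL of the article and [6] is its title.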
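The parsing steps above can be sketched in Python using only the standard library. This is a minimal illustration, not a tested scraper: it assumes that date rows are the `li` elements whose class list contains `date`, that the page is otherwise a flat list as described, and the names `ArchiveListParser` and `parse_archive` are hypothetical.

```python
from html.parser import HTMLParser


class ArchiveListParser(HTMLParser):
    """Collect (date, relative_url, title) triples from one archive page."""

    def __init__(self):
        super().__init__()
        self.articles = []          # list of (date, href, title)
        self._current_date = None   # text of the most recent date row, [3]
        self._in_date_li = False    # inside a date <li>
        self._in_li = False         # inside an ordinary (article) <li>
        self._href = None           # href of the link being read, [5]
        self._text = []             # text pieces of the current date or title

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "li":
            # Assumption: a date row is any <li> whose classes include "date".
            classes = (attrs.get("class") or "").split()
            self._in_date_li = "date" in classes
            self._in_li = not self._in_date_li
            self._text = []
        elif tag == "a" and self._in_li and attrs.get("href"):
            self._href = attrs["href"]
            self._text = []

    def handle_data(self, data):
        if self._in_date_li or self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            title = "".join(self._text).strip()   # [6]
            if self._current_date is not None:
                self.articles.append((self._current_date, self._href, title))
            self._href = None
        elif tag == "li":
            if self._in_date_li:
                self._current_date = "".join(self._text).strip()
            self._in_date_li = False
            self._in_li = False


def parse_archive(html):
    """Return the list of (date, relative_url, title) found in html."""
    parser = ArchiveListParser()
    parser.feed(html)
    parser.close()
    return parser.articles
```

Calling `parse_archive(page_source)` on the text of a downloaded archive page should give one triple per article link, each tagged with the date row that precedes it. Links that appear before the first date row are skipped.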


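{|class="wikitable"
|+ possible values for [1] and [2]
|-
! [1] !! [2]
|-
| ky-kyrgyzstan || 392
|-
| ky-central_asia || 393
|-
| ky-world || 394
|-
| ky-politics || 395
|-
| ky-human_rights || 396
|-
| ky-economy || 397
|-
| ky-culture || 398
|-
| ky-voice_of_people || 399
|-
| ky-sport || 400
|}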
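The table and the URL pattern can be combined to generate the archive URLs to fetch. The following is a rough sketch under assumptions: the names `SECTIONS`, `archive_url` and `archive_urls` are hypothetical, and it assumes that every section has one archive page per calendar day, which this page does not confirm.

```python
from datetime import date, timedelta

# Values of [1] and [2] copied from the table above.
SECTIONS = {
    "ky-kyrgyzstan": 392,
    "ky-central_asia": 393,
    "ky-world": 394,
    "ky-politics": 395,
    "ky-human_rights": 396,
    "ky-economy": 397,
    "ky-culture": 398,
    "ky-voice_of_people": 399,
    "ky-sport": 400,
}

# Pattern from step 1: /archive/[1]/[date]/[2]/[2].html
URL_TEMPLATE = "http://www.azattyk.org/archive/{name}/{day}/{num}/{num}.html"


def archive_url(section, day):
    """Build the archive URL for one section ([1]) and one date."""
    return URL_TEMPLATE.format(
        name=section,
        day=day.strftime("%Y%m%d"),   # [date] is in format yyyymmdd
        num=SECTIONS[section],        # [2]
    )


def archive_urls(start, end):
    """Yield one URL per section per day, with start and end inclusive."""
    day = start
    while day <= end:
        for section in SECTIONS:
            yield archive_url(section, day)
        day += timedelta(days=1)
```

For example, `archive_url("ky-sport", date(2010, 1, 1))` yields `http://www.azattyk.org/archive/ky-sport/20100101/400/400.html`, which matches the example URL in the earlier revision of this page.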