Click and navigate to chapters and sections
Latest revision as of 13:53, 8 June 2020
In this example, we created separate pages for chapters and sections so that each part of the document could
be authored by a different person. In Informl, links to these pages carry the CSS class name "existingWikiWord",
as the following markup shows.
<p> <a class="existingWikiWord" href="http://localhost:3010/AnnRep07/pages/APCC+Research+and+Development+Projects"> APCC Research and Development Projects </a> </p>
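To make the selection rule concrete, here is a minimal sketch that picks out such links using only the Ruby standard library. It is a regex-based stand-in, not Hpricot: it assumes the attributes appear in the order `class`, then `href`, as in the markup above. The helper name `wiki_links` and the second, non-matching link are hypothetical and serve only as illustration.

```ruby
# Stand-in for an Hpricot search such as (doc/"p/a.existingWikiWord").
# Assumption: links are written as <a class="existingWikiWord" href="...">text</a>.
WIKI_LINK = %r{<a\s+class="existingWikiWord"\s+href="([^"]*)"\s*>(.*?)</a>}m

# Returns one hash per matching link, with its target and its visible text.
def wiki_links(html)
  html.scan(WIKI_LINK).map { |href, text| { href: href, text: text.strip } }
end

html = <<~HTML
  <p> <a class="existingWikiWord" href="http://localhost:3010/AnnRep07/pages/APCC+Research+and+Development+Projects"> APCC Research and Development Projects </a> </p>
  <p> <a class="other" href="http://localhost:3010/ignored"> Not a wiki page </a> </p>
HTML

links = wiki_links(html)
links.each { |l| puts "#{l[:text]} -> #{l[:href]}" }
```

Only the first link is reported, because the second does not carry the "existingWikiWord" class. A real HTML parser is more robust than a regular expression whenever the attribute order or quoting varies.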
Knowing this, I used the following Hpricot code to follow the chapter and section links and retrieve their contents.
<pre>
chapters= (doc/"p/a.existingWikiWord")
# we need to navigate one more level into the web page
# let us discover the links for that
chapters.each do |ch|
  chap_link = ch.attributes['href']
  # using inner_html we can create subdirectories
  chap_name = ch.inner_html.gsub(/\s*/,"")
  chap_name_org = ch.inner_html
  # We create chapter directories
  system("mkdir -p #{chap_name}")
  fil.write "\\input #{chap_name} \n"
  chapFil="#{chap_name}.tex"
  `rm #{chapFil}`
  cFil=File.new(chapFil,"a")
  cFil.write "\\chapter{ #{chap_name_org} } \n"

  # We navigate to sections now
  doc2=Hpricot(open(chap_link))
  sections= (doc2/"p/a.existingWikiWord")
  sections.each do |sc|
    sec_link = sc.attributes['href']
    sec_name = sc.inner_html.gsub(/\s*/,"")
    secFil="#{chap_name}/#{sec_name}.tex"
    `rm #{secFil}`
    sFil=File.new(secFil,"a")
    sechFil="#{chap_name}/#{sec_name}.html"
    `rm #{sechFil}`
    shFil=File.new(sechFil,"a")
</pre>
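The naming scheme used above can be tried in isolation. The sketch below shows how link text turns into the directory and file names; the helper `tex_paths` and the section title "Climate Prediction" are hypothetical, chosen only for illustration.

```ruby
# Hypothetical helper mirroring the naming used above: all whitespace is
# removed from the link text so the result is usable as a file or directory
# name. (/\s+/ gives the same result as the /\s*/ pattern used above.)
def tex_paths(chapter_title, section_title)
  chap_name = chapter_title.gsub(/\s+/, "")
  sec_name  = section_title.gsub(/\s+/, "")
  {
    chapter_dir:  chap_name,
    chapter_tex:  "#{chap_name}.tex",
    section_tex:  "#{chap_name}/#{sec_name}.tex",
    section_html: "#{chap_name}/#{sec_name}.html"
  }
end

paths = tex_paths(" APCC Research and Development Projects ", " Climate Prediction ")
paths.each { |kind, path| puts "#{kind}: #{path}" }
```

Note that two titles differing only in whitespace would map to the same name, so this scheme relies on chapter and section titles being distinct once spaces are removed.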
After navigating to the sections (h1 elements in HTML), we retrieve their contents and pass them to the Ruby script "scrape_page.rb", called below as scrape_the_page, for filtering.
<pre>
    # scrape_the_page(sec_link,"#{chap_name}/#{sec_name}")
    scrape_the_page(sec_link,sFil,shFil)
    cFil.write "\\input #{chap_name}/#{sec_name} \n"
  end
end
fil.write "\\stoptext \n"
</pre>
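To see what the nested loops leave behind, the following sketch replays the same writes against in-memory buffers instead of real files and web pages. The outline, its titles, and the helper `squeeze_name` are hypothetical. Only the lines written by the loops shown here are reproduced, and nothing is assumed about what was written to the master file earlier.

```ruby
require "stringio"

# Hypothetical outline standing in for the scraped chapter and section links.
outline = {
  "Chapter One" => ["First Section", "Second Section"],
  "Chapter Two" => ["Third Section"]
}

# Same whitespace stripping as in the code above.
def squeeze_name(title)
  title.gsub(/\s+/, "")
end

master = StringIO.new   # plays the role of `fil`
chapter_files = {}      # file name => contents of that chapter's .tex file

outline.each do |chap_title, sections|
  chap_name = squeeze_name(chap_title)
  master.write "\\input #{chap_name} \n"
  c_fil = StringIO.new  # plays the role of `cFil`
  c_fil.write "\\chapter{ #{chap_title} } \n"
  sections.each do |sec_title|
    # The real code also calls scrape_the_page here to fill the section file.
    c_fil.write "\\input #{chap_name}/#{squeeze_name(sec_title)} \n"
  end
  chapter_files["#{chap_name}.tex"] = c_fil.string
end
master.write "\\stoptext \n"

puts master.string
chapter_files.each { |name, body| puts "--- #{name}", body }
```

The master file thus inputs one file per chapter, and each chapter file opens with its `\chapter` title and inputs one file per section.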