8,612 bytes removed ,  14:37, 8 June 2020
* Your partner(s) may not be used to TeX (ConTeXt) markup.
* Marking up editorial comments is difficult (for instance, in WYSIWYG software such as Word, edits can be highlighted by switching on "Track changes").
* One needs to have the same document in different formats (PDF, HTML, DOC, etc.).
There are several different ways to address each of these issues. For instance:
* Have the partner construct the document in WYSIWYG software, then have them import it into OpenOffice and export it to TeX.
* Have the partner edit the source code
== Translating HTML into ConTeXt using Ruby ==
The next step is to retrieve the HTML pages created in the step above. Here I have used the Ruby library 'open-uri' to retrieve the web page and another library, 'hpricot', to edit these pages and translate HTML markup into ConTeXt markup.
* [[Navigating to page]]
* [[Setting up ConTeXt document]]
* [[Click and navigate chapters and sections]]
* [[HTML to ConTeXt]]
** [[Removing unwanted markup]]
** [[Simple replacements]]
** [[Translating Figure markup]]
** [[Translating Table markup]]
** [[The rest of the filters]]
=== Step 1. Open the remote page ===
<pre>
#!/usr/bin/ruby
# scan_page.rb = Retrieves the html page of interest from the server,
#                navigates to links within the main page and constructs a
#                ConTeXt document
require 'rubygems'
require 'open-uri'    # the open-uri library
require 'hpricot'     # the hpricot library
require 'scrape_page' # user-defined function to filter html into ConTeXt

# scans the home page and lists
# all the directories and subdirectories
doc=Hpricot(open("http://ipa.dd.re.ss/AnnRep07"))
</pre>
=== Step 2. Setting up the ConTeXt document ===
<pre>
mainfil="annrep.tex"
# open a file to output ConTeXt document
`rm #{mainfil}`
fil=File.new(mainfil,"a")

# Add some opening directives and include style files
fil.write "\\input context_styles \n" # this file contains the styling options for my ConTeXt document
fil.write "\\starttext \n"
fil.write "\\leftaligned{\\BigFontOne Contents} \n"
fil.write "\\vfill \n"
fil.write "{ \\switchtobodyfont[10pt] "
fil.write "\\startcolumns[n=2,balance=no,rule=off,option=background,frame=off,background=color,backgroundcolor=blue:1] \n"
fil.write "\\placecontent \n"
fil.write "\\stopcolumns \n"
fil.write "}"
</pre>
=== Step 3. Clicking chapters and section links ===
In this example, we created new pages for chapters and sections so that each part of the document could be authored by a different person. In Informl, new pages are indicated by the CSS class name "existingWikiWord", as shown in the following figure.

[[Image:Wiki_prev2.jpg]]
<pre>
<p>
<a class="existingWikiWord" href="http://localhost:3010/AnnRep07/pages/APCC+Research+and+Development+Projects">
APCC Research and Development Projects
</a></p>
</pre>
Knowing this, I have used the following 'hpricot' code to click on chapter and section links to retrieve their contents.
<pre>
chapters= (doc/"p/a.existingWikiWord")
# we need to navigate one more level into the web page
# let us discover the links for that
chapters.each do |ch|
  chap_link = ch.attributes['href']
  # using inner_html we can create subdirectories
  chap_name = ch.inner_html.gsub(/\s*/,"")
  chap_name_org = ch.inner_html
  # We create chapter directories
  system("mkdir -p #{chap_name}")
  fil.write "\\input #{chap_name} \n"
  chapFil="#{chap_name}.tex"
  `rm #{chapFil}`
  cFil=File.new(chapFil,"a")
  cFil.write "\\chapter{ #{chap_name_org} } \n"
</pre>
<pre>
  # We navigate to sections now
  doc2=Hpricot(open(chap_link))
  sections= (doc2/"p/a.existingWikiWord")
  sections.each do |sc|
    sec_link = sc.attributes['href']
    sec_name = sc.inner_html.gsub(/\s*/,"")

    secFil="#{chap_name}/#{sec_name}.tex"
    `rm #{secFil}`
    sFil=File.new(secFil,"a")
    sechFil="#{chap_name}/#{sec_name}.html"
    `rm #{sechFil}`
    shFil=File.new(sechFil,"a")
</pre>
After navigating to sections (h1 elements in HTML), retrieve their contents and send them to the Ruby function in "scrape_page.rb" for filtering.
<pre>
    # scrape_the_page(sec_link,"#{chap_name}/#{sec_name}")
    scrape_the_page(sec_link,sFil,shFil)
    cFil.write "\\input #{chap_name}/#{sec_name} \n"
  end
end
fil.write "\\stoptext \n"
</pre>
=== Filtering HTML into ConTeXt ===
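The original "scrape_page.rb" is not reproduced here, so the following is only a minimal sketch of what a replacement-style filter could look like. It uses plain String#gsub from the Ruby standard library rather than 'hpricot'. The function name html_to_context, the rule table, and the mapping of h1/h2 to \section/\subsection are all assumptions made for illustration.

```ruby
# Hypothetical filter, NOT the original scrape_page.rb.
# Each rule pairs a pattern for one HTML element with a lambda that
# builds the ConTeXt markup from the element's inner text.
RULES = [
  [%r{<h1[^>]*>(.*?)</h1>}mi,                ->(t) { "\\section{#{t}}\n" }],
  [%r{<h2[^>]*>(.*?)</h2>}mi,                ->(t) { "\\subsection{#{t}}\n" }],
  [%r{<(?:b|strong)>(.*?)</(?:b|strong)>}mi, ->(t) { "{\\bf #{t}}" }],
  [%r{<(?:i|em)>(.*?)</(?:i|em)>}mi,         ->(t) { "{\\em #{t}}" }],
  # "<p" must be followed by ">" or whitespace so that <pre> is not matched
  [%r{<p(?:\s[^>]*)?>(.*?)</p>}mi,           ->(t) { "#{t.strip}\n\n" }]
].freeze

# Characters that TeX treats specially and that can simply be backslash-escaped.
TEX_SPECIALS = /[\#\$%&_]/

def html_to_context(html)
  text = html.dup
  # Apply the element rules in order; the block form of gsub keeps
  # backslashes in the replacement text literal.
  RULES.each do |pattern, build|
    text = text.gsub(pattern) { build.call($1) }
  end
  text = text.gsub(/<[^>]+>/, "")   # drop any tag without a rule
  # Decode the three basic entities; others are left as they are.
  text = text.gsub("&lt;", "<").gsub("&gt;", ">").gsub("&amp;", "&")
  # Escape TeX specials last, so the commands emitted above stay intact.
  text.gsub(TEX_SPECIALS) { |ch| "\\#{ch}" }
end
```

For example, html_to_context('<p>Costs rose <b>10%</b>.</p>') yields the text Costs rose {\bf 10\%}. followed by a blank line. Regular expressions are fragile on nested or malformed HTML, so a real filter would more likely walk the parsed document, as the 'hpricot' code above does for links.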
