News Conversion

XML Feed

Some newspapers want a feed of all their content to populate an existing website design, as an alternative to the hassle of cutting and pasting or to low-quality, high-effort feeds.

We accept the PDF files used to print the paper and create an XML file that contains the elements: PaperIndex, Filename, Category, Headline, Subhead, Byline, and HTMLcontent.

Category values can be specified by the newspaper and usually correspond to the sections of the website. PaperIndex is the name of the paper; together with Filename, it identifies each story. Headline, Subhead, and Byline have their standard newspaper-industry meanings. The HTMLcontent element is a CDATA section containing the complete story in HTML format, including all pictures and captions.
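As a rough illustration of how a website might consume such a feed, the sketch below reads the seven elements for each story using Python's standard library. The sample data, the <Feed> root, and the per-story <Story> wrapper are assumptions made for this example; only the seven element names come from the description above, and the actual file layout may differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed. The <Feed> root and <Story> wrapper are assumed
# for illustration; only the seven child element names are documented.
SAMPLE_FEED = """<Feed>
  <Story>
    <PaperIndex>Example Gazette</PaperIndex>
    <Filename>A01-council-vote</Filename>
    <Category>Local News</Category>
    <Headline>Council approves budget</Headline>
    <Subhead>Vote passes 5-2</Subhead>
    <Byline>By Jane Doe</Byline>
    <HTMLcontent><![CDATA[<P>The city council voted Tuesday.</P><P>The budget takes effect in July.</P>]]></HTMLcontent>
  </Story>
</Feed>
"""

FIELDS = ("PaperIndex", "Filename", "Category", "Headline",
          "Subhead", "Byline", "HTMLcontent")


def parse_feed(xml_text):
    """Return one dict per story, keyed by the seven element names."""
    root = ET.fromstring(xml_text)
    stories = []
    for story in root.iter("Story"):
        # findtext returns None for a missing element; treat that as empty.
        stories.append({name: (story.findtext(name) or "").strip()
                        for name in FIELDS})
    return stories


stories = parse_feed(SAMPLE_FEED)
for s in stories:
    print(s["Category"], "|", s["Headline"], "|", s["Byline"])
```

Because HTMLcontent is a CDATA section, the parser hands back the story's HTML as plain text, ready to drop into a page template.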

Key features of the PDF conversion that are often missing from competitors' PDF-to-HTML conversion packages include:

  • Paragraph detection with proper placement of <P> tags, without random returns in the middle of paragraphs;
  • Proper handling of bold, italic and underline;
  • Proper handling of drop caps and drop shadows;
  • Removal of random spaces inside words;
  • Insertion of missing spaces between words that run together;
  • Proper conversion of tables into <TABLE> tags.

Per page pricing

Page Size    Price per page
11 x 17      95 cents
11 x 22      $1.23
17 x 22      $1.89
Minimum fees per issue apply.
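
As a worked example of the per-page arithmetic, the sketch below estimates the cost of one issue from the rates in the price list. The minimum fee amount is not stated here, so minimum_fee is a hypothetical parameter that defaults to zero; substitute the actual figure once known.

```python
from decimal import Decimal

# Per-page rates taken from the price list, keyed by page size.
RATES = {
    "11 x 17": Decimal("0.95"),
    "11 x 22": Decimal("1.23"),
    "17 x 22": Decimal("1.89"),
}


def issue_cost(page_size, pages, minimum_fee=Decimal("0")):
    """Estimate the cost of one issue.

    minimum_fee is a placeholder: the real per-issue minimum is not
    published in the price list.
    """
    return max(RATES[page_size] * pages, minimum_fee)


print(issue_cost("11 x 17", 24))  # 24 pages at 95 cents each -> 22.80
```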