Friday, December 16, 2016

ASP.Net C# Convert HTML to PDF using iTextSharp (update)

Here's a minor update to a post from last year on how to convert HTML to PDF using iTextSharp.

I wanted to add a way to optionally render the PDF in landscape mode, so this is what I ultimately came up with:

 public static ReturnValue ConvertHtmlToPdfAsBytes(string HtmlData, bool Landscape = false)
{
    // variables
    ReturnValue Result = new ReturnValue();

    // normalize unclosed <br> tags, since the parser expects well-formed XHTML and we
    // can't always control the incoming html (string.Replace is case-sensitive, so <BR> is not covered)
    HtmlData = HtmlData.Replace("<br>", "<br />");

    // convert html to pdf
    try
    {
        // create a stream that we can write to, in this case a MemoryStream
        using (var stream = new MemoryStream())
        {
            // create an iTextSharp Document which is an abstraction of a PDF but **NOT** a PDF
            using (var document = new Document())
            {
                // portrait vs landscape
                if (Landscape)
                {
                    document.SetPageSize(PageSize.A4.Rotate());
                }

                // create a writer that's bound to our PDF abstraction and our stream
                using (var writer = PdfWriter.GetInstance(document, stream))
                {
                    // open the document for writing
                    document.Open();

                    // read html data to StringReader
                    using (var html = new StringReader(HtmlData))
                    {
                        XMLWorkerHelper.GetInstance().ParseXHtml(writer, document, html);
                    }

                    // close document
                    document.Close();
                }
            }

            // get bytes from stream
            Result.Data = stream.ToArray();

            // success
            Result.Success = true;
        }
    }
    catch (Exception ex)
    {
        Result.Success = false;
        Result.Message = ex.Message;
    }

    // return
    return Result;
}
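
The landscape switch above hinges on PageSize.A4.Rotate(), which returns a rotated copy of the A4 rectangle rather than changing PageSize.A4 itself. As a small sketch of what that call does (the class name PageSizeSketch is made up for illustration, and it assumes the iTextSharp 5.x assembly is referenced), you can print both sets of dimensions, which iTextSharp reports in points:

```csharp
using System;
using iTextSharp.text;

// Hypothetical demo class, not part of the original post.
public static class PageSizeSketch
{
    public static void Main()
    {
        // the stock A4 page, portrait by default
        Rectangle portrait = PageSize.A4;

        // Rotate() returns a new rectangle with width and height swapped
        Rectangle landscape = PageSize.A4.Rotate();

        Console.WriteLine("Portrait:  {0} x {1}", portrait.Width, portrait.Height);
        Console.WriteLine("Landscape: {0} x {1}", landscape.Width, landscape.Height);
    }
}
```

Because Rotate() works on any page size rectangle, the same idea should carry over to other sizes such as PageSize.LETTER if A4 isn't what you need.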

The ReturnValue class is simply a helper class that looks like this:

 // return value class  
 public class ReturnValue  
 {  
       // constructor  
       public ReturnValue()  
       {  
         this.Success = false;  
         this.Message = string.Empty;  
       }  
   
       // properties  
       public bool Success = false;  
       public string Message = string.Empty;  
       public Byte[] Data = null;  
 }  

We also have another method that writes the PDF to a file on disk, in case you don't want to work with the byte array directly, for example:

 public static ReturnValue ConvertHtmlToPdfAsFile(string FilePath, string HtmlData)  
 {  
       // variables  
       ReturnValue Result = new ReturnValue();  
   
       try  
       {  
         // convert html to pdf and get bytes array  
         Result = ConvertHtmlToPdfAsBytes(HtmlData: HtmlData);  
   
         // check for errors  
         if (!Result.Success)  
         {  
           return Result;  
         }  
   
         // create file  
         File.WriteAllBytes(path: FilePath, bytes: Result.Data);  
   
         // result  
         Result.Success = true;  
       }  
       catch(Exception ex)  
       {  
         Result.Success = false;  
         Result.Message = ex.Message;  
       }  
   
       // return  
       return Result;  
 }  

Keep in mind that this only works with valid, well-formed HTML; otherwise you can expect iTextSharp to throw an exception during parsing, which the methods above catch and report through Result.Message. But if you have control over the HTML that you need to convert, this solution works well and produces very nice PDF files.
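
If you'd like to catch malformed markup before it ever reaches iTextSharp, one option is a quick pre-check with the XML parser that ships with .NET. This is only a sketch: HtmlPreflight and IsWellFormed are names I made up for illustration, and the check is not identical to what iTextSharp's own parser accepts. In particular, the .NET XML parser rejects named HTML entities such as &nbsp; unless they are declared, so treat a failure here as a hint rather than a verdict.

```csharp
using System;
using System.Xml;
using System.Xml.Linq;

// Hypothetical helper, not part of the original post.
public static class HtmlPreflight
{
    // Returns true when the markup parses as well-formed XML;
    // otherwise returns false and passes back the parser's message.
    public static bool IsWellFormed(string htmlData, out string error)
    {
        if (string.IsNullOrWhiteSpace(htmlData))
        {
            error = "No HTML data was supplied.";
            return false;
        }

        try
        {
            // throws XmlException on unclosed or mismatched tags
            XDocument.Parse(htmlData);
            error = string.Empty;
            return true;
        }
        catch (XmlException ex)
        {
            error = ex.Message;
            return false;
        }
    }
}
```

For example, a string containing an unclosed <br> inside a paragraph fails this check, while the same string with <br /> passes.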

It's worth noting that in our case we didn't need to pass the CSS in separately using the ParseXHtml method overload that takes a CSS stream, ParseXHtml(PdfWriter writer, Document doc, Stream inp, Stream inCssFile), because we included our CSS styles in the HTML data string itself, which was a bit cleaner for our solution.
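
If you do want to keep the CSS separate, a rough sketch using that overload might look like the following. We didn't go this route ourselves, so treat it as an untested starting point: PdfCssSketch and ConvertWithCss are hypothetical names, and wrapping both strings in UTF-8 MemoryStreams is my assumption about how you would feed in-memory strings to the stream-based overload.

```csharp
using System.IO;
using System.Text;
using iTextSharp.text;
using iTextSharp.text.pdf;
using iTextSharp.tool.xml;

// Hypothetical helper, not part of the original post.
public static class PdfCssSketch
{
    public static byte[] ConvertWithCss(string htmlData, string cssData)
    {
        using (var stream = new MemoryStream())
        {
            using (var document = new Document())
            using (var writer = PdfWriter.GetInstance(document, stream))
            // assumption: both inputs are encoded as UTF-8 for the stream overload
            using (var htmlStream = new MemoryStream(Encoding.UTF8.GetBytes(htmlData)))
            using (var cssStream = new MemoryStream(Encoding.UTF8.GetBytes(cssData)))
            {
                document.Open();

                // same helper as before, but with the stylesheet passed separately
                XMLWorkerHelper.GetInstance().ParseXHtml(writer, document, htmlStream, cssStream);

                document.Close();
            }

            // MemoryStream.ToArray still works after the writer has closed the stream
            return stream.ToArray();
        }
    }
}
```

The trade-off is one more input to manage, in exchange for being able to reuse a single stylesheet across many HTML fragments.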

Matt Pavey is a Microsoft Certified software developer who specializes in ASP.Net, VB.Net, C#, AJAX, LINQ, XML, XSL, Web Services, SQL, jQuery, and more. Follow on Twitter @matthewpavey
