Thursday, May 24, 2018

Combining Two PDFs Using .NET Core and a Free Library

I recently wrote a four-part series of posts describing how to display SSRS reports inside an Angular application. Once we delivered the requested features, a few more stories were added, among them the ability to combine multiple PDFs into one before displaying the result to the user. Since we already had everything in place to retrieve and display a single PDF at a time, the tricky part of this task was figuring out how to merge multiple PDFs into a single document on the fly.

I think my first step was probably the same as most other developers: I Googled it. I kept coming across iTextSharp as the most common solution others had used, so I dug into it and it looked perfect. Unfortunately, it wasn't free for our scenario (the license allows certain free usages, but we didn't fall under any of them), so it wasn't really an option. I found a couple of other ways to do what I needed, but none were as good or as fast. I ultimately came across a Stack Overflow answer where someone mentioned a port to .NET Core (did I mention this was in .NET Core 2.0?) of the last version of iTextSharp that was under the LGPL (read: essentially free) license. Perfect! All I had to do was figure out how to make it work, which led me to this post.

The NuGet package is iTextSharp.LGPLv2.Core (there is also iTextSharp.LGPLv2.Core.Fix, but I'm not sure what the differences are). Once I had that installed, it was a simple matter of writing a method to merge multiple PDFs together. Of course, I wanted the method to be reusable and injectable, so I created a basic interface with a single method that accepts any number of byte arrays (via the params keyword), each representing one of the PDFs to merge.
    public interface IProvidePdfMerging
    {
        // Merges the given PDFs, in the order supplied, into a single PDF.
        byte[] Merge(params byte[][] originals);
    }

And then I created the implementing class.

    using System.IO;
    using iTextSharp.text;
    using iTextSharp.text.pdf;

    public class PdfMerger : IProvidePdfMerging
    {
        public byte[] Merge(params byte[][] originals)
        {
            using (var stream = new MemoryStream())
            {
                // PdfCopy writes every page added to it into the output stream.
                var doc = new Document();
                var pdf = new PdfCopy(doc, stream);
                doc.Open();

                foreach (var file in originals)
                {
                    var reader = new PdfReader(file);

                    // PDF page numbers are 1-based.
                    for (var pageNumber = 1; pageNumber <= reader.NumberOfPages; pageNumber++)
                    {
                        var page = pdf.GetImportedPage(reader, pageNumber);
                        pdf.AddPage(page);
                    }

                    // Release the reader's resources before moving to the next file.
                    pdf.FreeReader(reader);
                    reader.Close();
                }

                // Closing the document finalizes the PDF written to the stream.
                doc.Close();

                return stream.ToArray();
            }
        }
    }

And that's pretty much it. I pass in the byte arrays representing the PDFs I want to merge, in the order I want to merge them, and then the resulting byte array is my new PDF. It's clean, it's fast, it's reusable, it's injectable. I covered all the bases pretty easily here.
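
For anyone who wants to see it wired up, here is a minimal sketch of registering and calling the merger from a console app. It assumes the Microsoft.Extensions.DependencyInjection package is referenced alongside the interface and class above. The file names (cover.pdf, report.pdf, merged.pdf) and the transient lifetime are placeholders I picked for illustration, not something the original project dictates.

```csharp
using System.IO;
using Microsoft.Extensions.DependencyInjection;

public static class Program
{
    public static void Main()
    {
        // Register the merger against its interface so it can be injected.
        // Transient is an assumption; the class holds no state, so any lifetime should work.
        var services = new ServiceCollection();
        services.AddTransient<IProvidePdfMerging, PdfMerger>();

        using (var provider = services.BuildServiceProvider())
        {
            var merger = provider.GetRequiredService<IProvidePdfMerging>();

            // Hypothetical input files; the pages end up in the order the arrays are passed.
            var cover = File.ReadAllBytes("cover.pdf");
            var report = File.ReadAllBytes("report.pdf");

            var merged = merger.Merge(cover, report);

            // Hypothetical output path for the combined document.
            File.WriteAllBytes("merged.pdf", merged);
        }
    }
}
```

In an ASP.NET Core app, the same AddTransient call would normally go in Startup.ConfigureServices, and the interface would be taken as a constructor parameter wherever the merged PDF is needed.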
