Parsing big JSON files with StreamReader in C#

How to read a large JSON file through a file stream with Newtonsoft.Json, so that the whole document never has to be loaded into memory at once.

I will use Newtonsoft.Json (Json.NET), a widely used JSON library for .NET.

Note: in this case the JSON is the output of MultipleModelAPIView from DjangoRestMultipleModels, an extension of django-rest-framework.

The JSON structure is the following

With the corresponding C# classes
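For illustration, a payload of the shape the parser below expects, together with matching classes, might look like the following sketch. The shape (one root object whose properties each hold a list of objects) is inferred from how the parser walks the tokens; the names Play and Poem, their fields, and the SampleData holder are assumptions made up for this example, not the original data.

```csharp
using Newtonsoft.Json;

// Hypothetical model classes; names and fields are assumptions for this example.
public class Play
{
    [JsonProperty("title")] public string Title { get; set; }
    [JsonProperty("genre")] public string Genre { get; set; }
}

public class Poem
{
    [JsonProperty("title")] public string Title { get; set; }
    [JsonProperty("stanzas")] public int Stanzas { get; set; }
}

public static class SampleData
{
    // Assumed payload shape: each property of the root object is a list of objects.
    public const string Json = @"{
  ""Play"": [
    { ""title"": ""Romeo and Juliet"", ""genre"": ""Tragedy"" },
    { ""title"": ""Twelfth Night"", ""genre"": ""Comedy"" }
  ],
  ""Poem"": [
    { ""title"": ""Shall I compare thee"", ""stanzas"": 1 }
  ]
}";
}
```

With this shape, the property names at the root ("Play", "Poem") are the keys that go into the PropertyToType dictionary.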

We start by creating an instance of the parser class DinamicStreamJsonParser; the PropertyToType dictionary maps each property name found in the JSON to the corresponding C# class.

Every time the parser reaches the first object of a new list it calls the NewTypeFoundCallback delegate with that object and the name of the list, and every time it finds a new object (a new instance of our object in the list) it calls the ObjectFoundCallback delegate.

We pass the jsonPath string variable to the StreamReader constructor, so the parser reads directly from the file.
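The setup described above could be sketched as follows. It assumes the DinamicStreamJsonParser class from the implementation in this post (namespace MyApp); the Play and Poem classes, the "data.json" default path, and the ParserUsage.Run helper are hypothetical names introduced only for this example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using MyApp; // namespace of DinamicStreamJsonParser in this post

// Minimal stand-in model classes (hypothetical).
public class Play { public string Title { get; set; } }
public class Poem { public string Title { get; set; } }

public static class ParserUsage
{
    // Parses the file at jsonPath and returns one log line per callback invocation.
    public static List<string> Run(string jsonPath)
    {
        var log = new List<string>();

        // The caller owns the StreamReader, so it is disposed here.
        using (var streamReader = new StreamReader(jsonPath))
        {
            var parser = new DinamicStreamJsonParser
            {
                StreamReader = streamReader,
                // Keys are the property names expected in the root JSON object.
                PropertyToType = new Dictionary<string, Type>
                {
                    { "Play", typeof(Play) },
                    { "Poem", typeof(Poem) }
                },
                NewTypeFoundCallback = (obj, name) => log.Add("New list: " + name),
                ObjectFoundCallback = obj => log.Add("Object: " + obj.GetType().Name)
            };

            parser.Parse();
        }

        return log;
    }

    public static void Main(string[] args)
    {
        string jsonPath = args.Length > 0 ? args[0] : "data.json"; // hypothetical path
        foreach (var line in Run(jsonPath))
            Console.WriteLine(line);
    }
}
```

Because each object is handed to the callback as soon as it is deserialized, only one object at a time needs to be alive, provided the callbacks do not keep references to everything they receive.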

And here is the implementation.

using System;
using System.Collections.Generic;
using System.IO;
using Newtonsoft.Json;

namespace MyApp
{
    public class DinamicStreamJsonParser
    {
        public delegate void ObjectFound(object obj);
        public delegate void ObjectFoundWithName(object obj, string name);

        public StreamReader StreamReader;
        public Dictionary<string, Type> PropertyToType;
        public ObjectFound ObjectFoundCallback;
        public ObjectFoundWithName NewTypeFoundCallback = null;

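        public void Parse()
        {
            // The caller owns StreamReader and is responsible for disposing it.
            JsonTextReader reader = new JsonTextReader(StreamReader);
            reader.SupportMultipleContent = true;

            var serializer = new JsonSerializer();

            string currentFound = ""; // known root property currently being read
            string oldFound = "";     // root property of the previously emitted object
            int startObject = 0;      // current object nesting depth

            while (reader.Read())
            {
                if (reader.TokenType == JsonToken.StartObject)
                {
                    startObject++;
                }
                if (reader.TokenType == JsonToken.EndObject)
                {
                    startObject--;
                }

                // A property of the root object: remember it if it maps to a type,
                // otherwise clear it so its content is not read as the previous type.
                if (startObject == 1 && reader.TokenType == JsonToken.PropertyName)
                {
                    var name = (string)reader.Value;
                    currentFound = PropertyToType.ContainsKey(name) ? name : "";
                }

                // An object directly below a known root property.
                if (startObject == 2 && reader.TokenType == JsonToken.StartObject
                    && PropertyToType.ContainsKey(currentFound))
                {
                    var type = PropertyToType[currentFound];

                    // Close the generic Deserialize<T> over the runtime type and invoke it.
                    var o = typeof(DinamicStreamJsonParser)
                        .GetMethod(nameof(Deserialize))
                        .MakeGenericMethod(type)
                        .Invoke(this, new object[] { reader, serializer });

                    if (NewTypeFoundCallback != null && oldFound != currentFound)
                        NewTypeFoundCallback(o, currentFound);

                    ObjectFoundCallback?.Invoke(o);

                    oldFound = currentFound;

                    // Deserialize consumed the matching EndObject, so restore the depth here.
                    startObject--;
                }
            }
        }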

        public T Deserialize<T>(JsonTextReader reader, JsonSerializer serializer)
        {
            return serializer.Deserialize<T>(reader);
        }

    }
}

It’s pretty straightforward: it uses a counter to keep track of the object nesting depth (whether we are inside the root object or inside one of the objects in a list), and calls the delegates when needed.
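To see what the counter observes, the following sketch records every token together with the object depth at that point, using the same increment and decrement rules as the parser. DepthTrace and Trace are hypothetical names used only for this illustration.

```csharp
using System.Collections.Generic;
using System.IO;
using Newtonsoft.Json;

public static class DepthTrace
{
    // Returns one "depth tokenType value" line per token read from the JSON text.
    public static List<string> Trace(string json)
    {
        var lines = new List<string>();
        int depth = 0; // object nesting depth, counted like startObject in the parser

        using (var reader = new JsonTextReader(new StringReader(json)))
        {
            while (reader.Read())
            {
                if (reader.TokenType == JsonToken.StartObject) depth++;
                if (reader.TokenType == JsonToken.EndObject) depth--;

                lines.Add(depth + " " + reader.TokenType + " " + reader.Value);
            }
        }

        return lines;
    }
}
```

For an input such as {"Play":[{"title":"Hamlet"}]}, the property name Play is reported at depth 1 and the StartObject of the list item at depth 2, which are exactly the two situations the parser reacts to. Arrays do not change the counter, which is why it differs from the reader's own Depth property.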


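Get the type of the found object from our dictionary, using the current property name as the key.

Get the generic Deserialize method of our class through reflection and close it over that type, which fixes its return type.

Then invoke the method, passing the reader and the serializer.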
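These three steps can be tried in isolation with the sketch below. ReflectionDemo, DeserializeAs, DeserializeDirect and the Play class are hypothetical names for this example. The second helper shows a possible alternative that avoids reflection: Json.NET's JsonSerializer also has a non-generic Deserialize overload that takes the target Type directly.

```csharp
using System;
using System.IO;
using Newtonsoft.Json;

public class Play { public string Title { get; set; } } // hypothetical model

public static class ReflectionDemo
{
    public static T Deserialize<T>(JsonTextReader reader, JsonSerializer serializer)
    {
        return serializer.Deserialize<T>(reader);
    }

    // Same steps as in the parser: look up the generic method, close it over the
    // runtime type, then invoke it (null target because the method is static here).
    public static object DeserializeAs(Type type, string json)
    {
        using (var reader = new JsonTextReader(new StringReader(json)))
        {
            var serializer = new JsonSerializer();
            return typeof(ReflectionDemo)
                .GetMethod(nameof(Deserialize))
                .MakeGenericMethod(type)
                .Invoke(null, new object[] { reader, serializer });
        }
    }

    // Alternative without reflection, using the overload that accepts a Type.
    public static object DeserializeDirect(Type type, string json)
    {
        using (var reader = new JsonTextReader(new StringReader(json)))
        {
            return new JsonSerializer().Deserialize(reader, type);
        }
    }
}
```

Both helpers return the result as object, so the caller still has to cast it, just as the ObjectFoundCallback delegate does in the parser.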