Binary serialization – conversion of an object into a sequence of binary bytes

Binary serialization is a special form of serialization: the conversion of an object into a sequence of bytes.

It turns an object into a byte stream so that the object can be kept in memory, stored in a database or a file, or transferred to another system.

Serialization of data in computer science

Data serialization is therefore a concept and a design pattern for programming languages.

How does the serialization of data work?

During serialization, an object is converted into a data stream that carries the object's state.

The data stream can also contain information on the object type, e.g. version, data type and assembly name.

This data stream can then be stored in a database, in a file or in memory.
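
The flow described above can be sketched in Python roughly as follows. This is a minimal illustration, not a standard format: the `Point` class, the envelope keys `type`, `version` and `state`, and the helper names `serialize` and `deserialize` are assumptions made for this example. An in-memory `io.BytesIO` stands in for a file or a network stream.

```python
import io
import json
from dataclasses import dataclass, asdict
from typing import BinaryIO


@dataclass
class Point:
    """Hypothetical example type; any plain data object would do."""
    x: int
    y: int


def serialize(obj: Point, stream: BinaryIO) -> None:
    # The envelope carries type information next to the object's state.
    envelope = {"type": type(obj).__name__, "version": 1, "state": asdict(obj)}
    stream.write(json.dumps(envelope).encode("utf-8"))


def deserialize(stream: BinaryIO) -> Point:
    envelope = json.loads(stream.read().decode("utf-8"))
    # Refuse anything that is not the expected type and version.
    if envelope.get("type") != "Point" or envelope.get("version") != 1:
        raise ValueError("unexpected type or version in stream")
    return Point(**envelope["state"])


stream = io.BytesIO()            # stands in for a file, socket or database blob
serialize(Point(3, 4), stream)
stream.seek(0)                   # rewind before reading
restored = deserialize(stream)
print(restored)                  # Point(x=3, y=4)
```

The type and version check in `deserialize` is one possible way to use the type information in the stream; real formats differ in how much of it they store.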

What is the purpose of serializing data?

The main purpose of serialization is to save the state of an object so that it can be recreated if necessary.

As a developer, you can therefore save the state of an object at any time, recreate the object when needed and use it for the following purposes:

  • Sending the object over a remote connection, e.g. via a web service
  • Transferring objects from one domain to another
  • Passing an object through a firewall as a JSON or XML string
  • Managing security and user-specific information across applications

Why binary serialization opens up security risks

Binary serialization and XML serialization

Even if a programming language contains classes for binary serialization or XML serialization, these should be used with extreme caution!

Warning

Warning on the use of serialization in unprotected areas

Binary serialization uses a binary encoding to produce a compact representation of an object, either for storage or for socket-based network streams.
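
To make the compactness concrete, the following sketch encodes the same record once in a fixed binary layout and once as JSON text. The record, its field names and the layout string `"<IdH"` are assumptions chosen for illustration; Python's standard `struct` module does the packing.

```python
import json
import struct

# Hypothetical sensor record: id, temperature in degrees Celsius, status flags.
record = {"id": 1234567, "temperature": 21.5, "flags": 3}

# Binary form: little-endian uint32, float64, uint16, without padding.
LAYOUT = struct.Struct("<IdH")
binary = LAYOUT.pack(record["id"], record["temperature"], record["flags"])

# Text form of the same record, for comparison.
text = json.dumps(record).encode("utf-8")

print(len(binary), len(text))    # the binary form is the shorter one

# The binary form carries no field names, so the reader must know the layout.
rec_id, temperature, flags = LAYOUT.unpack(binary)
```

The price of the smaller size is that the bytes are not self-describing: sender and receiver have to agree on the layout in advance.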

With binary serialization, all members of the object are serialized, including private members and those with read-only attributes. Binary serialization is therefore usually compact and fast, but the access and rights management implemented in an application is bypassed: state that the application normally shields from outside access is written to the stream and restored from it as-is.
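
A small Python sketch can illustrate this behaviour, using the standard `pickle` module as the binary serializer. The `Account` class, its fields and the `init_calls` counter are hypothetical and exist only for this example.

```python
import pickle


class Account:
    """Hypothetical class whose constructor enforces a business rule."""
    init_calls = 0

    def __init__(self, owner: str, balance: int):
        Account.init_calls += 1
        if balance < 0:
            raise ValueError("balance must not be negative")
        self.owner = owner
        self._balance = balance          # "private" by convention

    @property
    def balance(self) -> int:            # read-only view, no setter
        return self._balance


account = Account("alice", 100)
payload = pickle.dumps(account)

# The private field travels in the byte stream.
print(b"_balance" in payload)            # True

# Restoring the object does not run __init__, so its check is skipped.
restored = pickle.loads(payload)
print(restored.balance, Account.init_calls)   # 100 1

# Anyone who can edit the bytes can change the state.
# (Same-length replacement keeps the stream structurally valid.)
tampered = payload.replace(b"alice", b"admin")
print(pickle.loads(tampered).owner)      # admin
```

The example only changes a harmless string, but it shows why a serialized object from an untrusted source cannot be assumed to respect the rules the class normally enforces.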

Security aspects of binary serialization

Binary serialization, in which data structures are converted into a binary format, entails specific security risks that must be carefully considered.

Risks of binary serialization

  1. Unauthorized code execution: If an attacker can supply manipulated serialized data that the application then deserializes, this can lead to the execution of malicious code.
  2. Data leaks: Sensitive information could be inadvertently included in the serialized data and then disclosed.
  3. Replay attacks: Without adequate security measures, serialized data could be intercepted and reused to cause unauthorized access or actions.
  4. Injections: Poorly validated input could lead to injection attacks in which unwanted data is fed into the deserialization process.
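
The first risk in the list can be demonstrated with Python's `pickle` module, whose documentation warns about exactly this. The sketch below is deliberately harmless: the `Exploit` class is an illustrative stand-in for attacker-controlled input, and `os.getcwd` takes the place of the harmful function a real attacker would choose.

```python
import os
import pickle


class Exploit:
    """Illustrative attacker-side class; the victim never needs to have it."""

    def __reduce__(self):
        # Tells pickle: "to rebuild me, call os.getcwd()".
        # A real attacker would name a harmful callable instead.
        return (os.getcwd, ())


malicious_bytes = pickle.dumps(Exploit())

# The receiving side only sees bytes. Loading them calls the function
# named in the stream and returns its result instead of an object.
result = pickle.loads(malicious_bytes)
print(type(result).__name__)     # str
```

The call happens during `pickle.loads` itself, before the application has any chance to inspect the result.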

Best practices for protection:

  1. Input validation: Ensure that all incoming data is validated before deserialization to detect and prevent malicious data.
  2. Minimize usage: Limit the use of binary serialization to trusted data sources and avoid it where possible.
  3. Use secure libraries: Choose libraries and tools that are known for their security measures and keep them up to date.
  4. Security audits and tests: Conduct regular security audits and penetration tests to identify and fix vulnerabilities.
  5. Object-level access control: Implement access controls that define which objects may and may not be deserialized.
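
Object-level access control (point 5) could look like the following in Python. The sketch follows the allow-list pattern shown in the `pickle` documentation: a subclass of `pickle.Unpickler` overrides `find_class`. The contents of `SAFE_BUILTINS` and the helper name `restricted_loads` are assumptions for this example.

```python
import builtins
import io
import pickle

# Hypothetical allow-list; extend it only with types the application needs.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the stream asks for a global (class or function).
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data: bytes):
    """Like pickle.loads(), but only allow-listed globals are resolved."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


print(restricted_loads(pickle.dumps([1, 2, range(15)])))

try:
    restricted_loads(pickle.dumps(print))    # builtins.print is not allow-listed
except pickle.UnpicklingError as err:
    print("rejected:", err)
```

An allow-list narrows the attack surface but does not make the format safe for arbitrary input; for untrusted sources, a data-only format such as JSON remains the more cautious choice.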

Tools and libraries for serialization

Different programming languages offer different tools and libraries for serialization. Here are some common examples:

Python:

  • pickle: A standard-library module for the serialization and deserialization of Python objects. Caution: pickle is not secure against erroneous or malicious data and should therefore never be used to load data from an untrusted or unknown source.
  • JSON: The json library is used for working with JSON data, a text-based and language-independent representation of objects.

Java:

  • Java Serialization API: Enables the serialization of objects into a byte stream, which can then be transferred to files, databases or via networks.
  • Jackson and Gson: Popular libraries for working with JSON, offering both serialization and deserialization.

JavaScript:

  • JSON.stringify/JSON.parse: Native methods in JavaScript for serializing and deserializing objects to and from JSON.
  • BSON: A binary representation of JSON-like documents, often used in conjunction with MongoDB.

While serialization is a powerful tool in software development, it poses significant security risks if handled improperly, especially with binary serialization.

Understanding these risks and applying the best practices above is essential to mitigate them. Choosing the right serialization tools and libraries for your programming language matters just as much for both the efficiency and the security of your applications.

What you should always bear in mind when serializing XML

XML serialization writes the public fields and properties of an object, or the parameters and return values of methods, to an XML stream that conforms to a specific XML Schema definition (XSD).

XML serialization leads to strongly typed classes with public properties and fields that are converted to XML.
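
As a rough Python counterpart, the following sketch writes only the public fields of an object to XML with the standard `xml.etree.ElementTree` module. The `Customer` class, the helper names `to_xml` and `customer_from_xml`, and the convention that a leading underscore marks non-public state are assumptions for this example. Validation against an XSD is not shown.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, fields


@dataclass
class Customer:
    """Hypothetical type; only its public fields are written out."""
    name: str
    city: str
    _session_token: str = ""     # leading underscore: treated as non-public


def to_xml(obj) -> str:
    root = ET.Element(type(obj).__name__)
    for f in fields(obj):
        if f.name.startswith("_"):
            continue                          # skip non-public state
        ET.SubElement(root, f.name).text = str(getattr(obj, f.name))
    return ET.tostring(root, encoding="unicode")


def customer_from_xml(xml_text: str) -> Customer:
    root = ET.fromstring(xml_text)
    return Customer(name=root.findtext("name"), city=root.findtext("city"))


xml_text = to_xml(Customer("Ada", "London", _session_token="secret"))
print(xml_text)   # <Customer><name>Ada</name><city>London</city></Customer>
```

Because only public fields are written, the session token never reaches the stream. Note that parsing XML from untrusted sources has pitfalls of its own, which the Python documentation for the `xml` package describes.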

For binary or XML serialization you need:

  1. the object to be serialized
  2. a stream that will contain the serialized object
  3. a formatter instance provided by the runtime to format the object

Selectively mark attributes in objects as non-serializable to prevent sensitive data from being serialized.
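
One way to express such a marker in Python is field metadata on a dataclass, as sketched below. The `User` class, the helper `to_json` and the metadata key `serialize` are invented for this example and are not a built-in convention; Java's `transient` keyword and .NET's `[NonSerialized]` attribute play the corresponding role in those platforms.

```python
import json
from dataclasses import dataclass, field, fields


@dataclass
class User:
    """Hypothetical type; `serialize` is a metadata key made up for this sketch."""
    name: str
    email: str
    password_hash: str = field(default="", metadata={"serialize": False})


def to_json(obj) -> str:
    # Keep only the fields that are not explicitly marked as non-serializable.
    state = {
        f.name: getattr(obj, f.name)
        for f in fields(obj)
        if f.metadata.get("serialize", True)
    }
    return json.dumps(state)


user = User("alice", "alice@example.com", password_hash="not-for-export")
print(to_json(user))   # {"name": "alice", "email": "alice@example.com"}
```

Fields are serialized by default here and must be excluded explicitly; for highly sensitive types, the opposite default (opt-in per field) may be the safer design.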

If a field of a serializable type contains a pointer, a handle or another data structure that is specific to a particular environment and cannot be meaningfully restored in any other environment, you should mark that field as non-serializable as well.

The following examples show how serialization is implemented in various programming languages:

Python – JSON serialization:

import json

# A dictionary in Python (illustrative values)
data = {"name": "Alice", "age": 30, "is_admin": False}

# Serialization: Converting the dictionary into a JSON string
json_string = json.dumps(data)

# Outputs the JSON string
print(json_string)

# Deserialization: Converting the JSON string back into a Python dictionary
parsed_data = json.loads(json_string)

# Outputs the restored dictionary
print(parsed_data)

Java – Java object serialization

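import java.io.*;

public class User implements Serializable {
    private static final long serialVersionUID = 1L; // version of the serialized form
    private String name;                             // illustrative field, serialized by default
}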

JavaScript – JSON serialization

// An object in JavaScript (illustrative values)
const data = { name: "Alice", age: 30, isAdmin: false };

// Serialization: Converting the object into a JSON string
const jsonString = JSON.stringify(data);

// Outputs the JSON string
console.log(jsonString);

// Deserialization: Converting the JSON string back into an object
let parsedData = JSON.parse(jsonString);

// Outputs the restored object
console.log(parsedData);

These examples show the basic serialization round trip in Python and JavaScript and the declaration of a serializable class in Java.

In Python and JavaScript, JSON is usually used for serialization, as it is natively supported and the data can be easily transferred between different systems.

Java provides a built-in facility for serializing objects, which is particularly useful when objects need to be sent over a network or stored in files.

In any case, it is important that you consider the security aspects of serialization, especially if you are working with external data.