Dabbling with structures and unions


There is a lot more to structures and unions than meets the eye, so I thought I'd write an article about them. I'm assuming that you're not a beginner at programming in C.

Can you speak in C?

To store or manipulate data in the computer's memory, we have to go through a computer program. We first instruct the compiler or the interpreter to allocate space in memory for the data. That location is given a name to make the data easy to retrieve and use; this is what we call a variable. A variable is a named location in memory where some data is stored. The data can be an integer, a floating-point number, a character or a boolean. Data types classify what kind of data the programmer intends to use.

There are two kinds of data types in C:
  • Primitive data types: They are the basic building blocks provided by the language, such as char, int, float, double and bool (spelled _Bool, or bool via <stdbool.h>, before C23).
  • User-defined data types: They are the ones waiting for you to create them, such as arrays, pointers, structures, unions and enums. (The C standard calls most of these derived types.)

We use a variable to store a single value, and an array to store multiple values of the same data type. But what if we want something that can store items of different data types? C has features such as structures and unions that allow us to group items of different data types into a single user-defined type.

Suppose you have to write a program that takes the details of several books as input and prints them as output. The details of a book are its book ID, publication date, title, author's name, price and type, i.e. paperback or hardcover. You also have to calculate the delivery charge for shipping a book and then print its final price. If the book is a paperback priced below Rs. 1000, charge Rs. 40 for delivery; if it's a hardcover priced below Rs. 1500, charge Rs. 50 for delivery.



Below is the code:

A structure occupies a single contiguous block of memory, so each field is located at a fixed offset from the start of the structure. The compiler gives every member its own storage within that block, and therefore altering the value of one member does not affect the others.
Assuming a 4-byte int, an 8-byte double and a 4-byte enum, as on the 32-bit compiler used here, how many bytes does each member take?
Title = 50 bytes
PublishedDate = 11 bytes
AuthorName = 30 bytes
BookID = 4 bytes
Price = 8 bytes
Type = 4 bytes
Total Bytes = 107 bytes
But sizeof(Book) reports that the structure takes 112 bytes.
We expected 107 bytes, but the actual size is 112 bytes. The size of a structure is greater than or equal to the sum of the sizes of its members. The structure is larger because the compiler inserts unused bytes between members, and possibly after the last one, so that each member starts at an address suitable for its type. This is called structure padding. Properly aligned data can generally be accessed faster by the processor, and some architectures require it.



Just as we can have pointers to an int or a double, we can have pointers to a structure.
#include<stdio.h>
int main(void){
    typedef struct{
        int length, width;
    }Line;
    Line L, *Lp;
    L.length = 12;
    Lp = &L; //pointer Lp points to L
    Lp->width = 2;
    printf("Length: %d, Width: %d\n", L.length, Lp->width);
    /** Through a pointer we use the arrow operator (->) rather than
    * the dot operator (.) to access structure members.
    * Lp->width is shorthand for (*Lp).width. **/
}
/** Prints
* Length: 12, Width: 2
**/
Lp was declared after the structure had been completely declared. But we can also declare a pointer to a structure before the structure's declaration is finished, because C allows pointers to incomplete types. A structure that contains a pointer to its own type is called a self-referential structure. It's an example of a recursive data type.
//A Linked List Node
typedef struct Node{
    int data;
    struct Node *next;
}Node;
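This structure contains a pointer member that can point to another object of the same type. A structure cannot, however, contain a non-pointer member of its own type: the type is still incomplete at that point, and such a structure would need infinite size, so the compiler rejects it.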
typedef struct Book{
    int price;
    struct Book weight; //error: member has incomplete type
}Book;
We can nest one structure within another structure.
#include<stdio.h>
#include<string.h>
int main(void){
    typedef struct{
        char Name[30];
        struct Address{
            char City[20];
            int Pin;
        }A;
    }Person;
    Person P;
    strcpy(P.Name, "Xyz Ijk");
    strcpy(P.A.City, "Jamaica");
    P.A.Pin = 569235;
    printf("Name: %s, City: %s, Pin: %d\n", P.Name, P.A.City, P.A.Pin);
}
/** Prints
* Name: Xyz Ijk, City: Jamaica, Pin: 569235
**/



Like structures, unions can group items of different data types, but there are significant differences between them. Unlike a structure, a union places all of its members at the same memory location, which they share. Therefore, writing to any member overwrites the bytes that the other members use. Hence, the most recently written member is the only one whose value you can rely on. The size of a union is equal to or greater than the size of its largest member.
#include<stdio.h>
#include<string.h>
int main(void){
    typedef union{
        int A, B;
        char C[10];
    }UnionD;
    printf("Size of the union: %zu\n", sizeof(UnionD)); //%zu is the format for size_t
    UnionD P; //union variable declared
    P.A = 20; //Value of B and C affected
    printf("A = %d, B = %d, C = %s\n", P.A, P.B, P.C);
    P.B = 10; //Value of A and C affected
    printf("A = %d, B = %d, C = %s\n", P.A, P.B, P.C);
    strcpy(P.C, "Godfather"); //Value of A and B affected
    printf("A = %d, B = %d, C = %s\n", P.A, P.B, P.C);
}
/** Prints (on a little-endian machine with a 4-byte int)
* Size of the union: 12
* A = 20, B = 20, C =  
* A = 10, B = 10, C = 
* A = 1717858119, B = 1717858119, C = Godfather
**/
Just as one structure can be nested within another, unions can be nested too. We can also put unions in structures and vice versa, and pointers to unions are allowed. We use a union when we need to work with only one of its members at a time. But, in my opinion, try not to use unions unless you are working in a memory-constrained environment: because the members share memory, it is easy to lose track of which member currently holds a valid value.

We can design complex abstract data types and data structures using both structures and unions.

If you've found any errors or want to give feedback, comment below.
