C# Types and Literals
Type Concepts, C# Pre-Defined Types & Literals
C# is statically typed. This means the types of all values must be known at compile time. Types determine the storage, behaviour and attributes of values, so every value has a type. Literals are simply constant values, and as such must also have a type. Some .NET types are so common that C# provides synonyms (aliases) for them.
PREREQUISITES — You should already…
- have some programming experience, preferably in C/C++/D/Objective-C/Java/Pascal;
- be conversant with the fundamentals of Object-Oriented Programming (OOP);
- understand the role of types in a statically typed language;
- understand expressions, operators and precedence in general.
CONVENTIONS
To keep code short, we may not always provide the full source for a syntactically complete file. Code snippets will always assume that the following preamble is at the top of the source file:
using System;
using System.Linq;
using System.Collections.Generic;
using static System.Console;
Executable statements will assume they are within some function body block, like `Main`. If only the definition of a function is provided, you must ensure that you use it inside a `class`. All code is assumed to be in the same namespace unless explicitly specified.
In this document, wherever we refer to any type in general, it will be notated as: type. You can just replace that with any known type. When we use ref-type, it must be constrained to reference types, while val-type must be limited to value types.
We use ident as placeholder for identifier, in other words, names of some language elements. The following are also identifiers: property, function, variable.
For access control on members of a class, you can use:
- `private`: only code in the class has permission to access it;
- `public`: all code in the program has permission to access it;
- `protected`: only code in the class and code in derived classes have permission;
- `internal`: all code in the assembly has permission to access it; or
- `protected internal` (also accepted as `internal protected`): like `internal`, but derived classes also have permission.
Fundamental Concepts
Programming languages aim to simplify the translation of human processes to machine code; whether that is native machine code that the processor can directly execute, or whether it is in the form of some intermediate language, as in .NET languages, which use CIL/IL (Common Intermediate Language). This translation is sometimes called ‘compilation’.
Overview of Major Concepts
Either way, one of the mechanisms employed by a programming language to ease the burden on the programmer, is the abstract concept of a type. If this concept can be applied to every value, the compiler can determine from it the size of memory required for the object — a task which is the responsibility of assembler programmers; tedious and error-prone. Furthermore, the compiler can ensure that operations on a value are supported by its type.
Purpose of Types
Since the compiler knows the types of all values at compile time, it can create operations optimised for the types involved. Addition of the different numeric types involves different machine instructions (or intermediate language instructions). If the types are not known, the compiler must dynamically (at runtime) first check the type, then jump to the appropriate functionality. This is not efficient at all.
Ultimately then, the purpose of abstracting the concept of a type is twofold:
reduce burden — Ease the programming process. If you have not had the opportunity to program at a low level, you may not appreciate how types alleviate a lot of drudgery, and enhance the integrity of your program.
performance: Improve efficiency. When all else is equal, statically typed languages will generally outperform dynamically typed languages, because type checks and operation selection happen at compile time rather than at run time.
Ultimately the purpose of types is irrelevant: since types govern everything that occurs in a program, you cannot escape them, so you may as well learn to love them.
Moving/Copying Values
When a data value must move, the type of the value determines how many bytes must be copied. In computer terminology, move means copy. The number of bytes in the destination of the copy must match the source data size. That will be true if the source and destination types are the same. If they are not the same type, the compiler will sometimes automatically convert (or ‘cast’) the source type to the destination type, but only if the destination type can represent every value of the source type (loosely: if it is ‘larger’). Data movement thus involves type conversion rules, and data movement occurs in the following situations:
- initialisation — When new variables are created and, as part of the (optional) syntax, given an initial value.
- assignment — When the assignment operator is used to overwrite an existing value in the destination.
- argument passing — Parameters are local variables. The only difference is that they are initialised by the argument-passing process (during a function call).
- function returns — Logically, every function that does not return `void` has a temporary ‘return variable’, whose type is determined by the return type of the function as defined. The statement ‘`return expr;`’ copies the result of the expression to this variable (initialises it).
This is a fundamental and consistent concept. We mention it here because each of the above cases is often seen or treated as a separate issue.
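The four situations can be seen side by side in a small sketch. The names `DataMovementDemo` and `Halve` are made up for this illustration; the point is only where a conversion is implicit and where a cast is required.

```csharp
using System;
using static System.Console;

public static class DataMovementDemo {
    // Argument passing initialises `parm`; `return` initialises the
    // temporary `long` that the caller receives.
    public static long Halve(long parm) {
        return parm / 2;
    }

    public static void Main() {
        int small = 100;       // initialisation.
        long big = small;      // implicit: every `int` fits in a `long`.
        big = Halve(small);    // argument passing: `int` converted to `long`.
        small = (int)big;      // assignment: `long` to `int` needs a cast.
        WriteLine("small={0}, big={1}", small, big);  //⇒ small=50, big=50
    }
}
```

Removing the `(int)` cast makes the assignment fail to compile, because a `long` value may not fit in an `int`.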
Memory Space
A type may represent something simple, like a number. The type will then define the size of the number (how many bytes a value of such a type occupies in memory). Different types of numbers have different characteristics. For example, integer (or integral) values cannot represent decimals (fractions), but real or floating-point type values can. Integer values come in different sizes (number of bytes)… determined by, of course, their types.
In C#, as a convenience, you can use the `sizeof(type)` operator to determine the number of bytes some type will occupy in memory. This is seldom necessary, but does illustrate that types determine the space objects occupy. The `sizeof(type)` operator is limited to unmanaged (native / non-CTS) types and a number of specific .NET types.
Sizes of some integer types
public static void Main () {
    WriteLine("sizeof(byte) = {0} byte ", sizeof(byte)); //⇒ … = 1 byte
    WriteLine("sizeof(int)  = {0} bytes", sizeof(int));  //⇒ … = 4 bytes
    WriteLine("sizeof(long) = {0} bytes", sizeof(long)); //⇒ … = 8 bytes
}
Memory management is removed from the programmer's responsibilities. Code depends on the .NET memory manager found in `System.GC` (Garbage Collector). Only advanced programs will ever directly interact with the Garbage Collector. In particular, programs will only use the ‘`new type`’ operator to allocate memory, and never explicitly release the memory.
TIP — Releasing Memory
A program cannot directly release memory it no longer needs. Since all non-trivial memory will be allocated with `new`, and thus be accessed through references, you can assign `null` (a keyword) to the variable holding the reference. This removes one reference to the object, and once no reachable reference to it remains, the Garbage Collector will automatically reclaim the memory (though not necessarily immediately). Note that the collector traces reachable references; it does not keep a reference count. This is a good convention for larger objects, and the best you can do in general. You can call `GC.Collect`, but there is no guarantee that a particular object will be reclaimed.
Some reference types manage resources like files, printers, devices, network connections, database connections, etc. They will all implement the `IDisposable` interface, which in turn means they will all have a `Dispose` method. A well-written type of this kind also overrides the `Finalize` method (written as a finaliser in C#) so that the resource is eventually released if your code did not call `Dispose`, but it really should.
IMPORTANT — Releasing Resources
For all types that implement the `IDisposable` interface, it is your responsibility to call the `Dispose` method it represents, as soon as your code is done with the resource. If you do not, it will negatively affect performance, and other programs may be blocked from accessing the resource for as long as your program is hanging on to it. Some types may have a `Close` method, which will call `Dispose` (but the type still implements `IDisposable`, which is your clue).
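One common way to make sure `Dispose` is called is the `using` statement, which this text has not introduced yet, so treat the following as a preview. `DisposeDemo` and `FirstLine` are hypothetical names; `StringReader` stands in for a real resource such as a file.

```csharp
using System;
using System.IO;
using static System.Console;

public static class DisposeDemo {
    public static string FirstLine(string text) {
        // `StringReader` implements `IDisposable`. The `using` statement
        // calls `Dispose` when the block exits, even after an exception.
        using (var reader = new StringReader(text)) {
            return reader.ReadLine();
        }
    }

    public static void Main() {
        WriteLine(FirstLine("alpha\nbeta"));  //⇒ alpha
    }
}
```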
Characteristics
The range of values is not the same for all types. This is determined in part by memory space, but even for different integer types which occupy the same amount of memory, the range of values will not necessarily be the same.
Based on the type, you may be able to perform arithmetic on it.
In the .NET Framework, and thus also in C#, types are divided firstly into two categories, which determine a type's characteristics: Reference Types and Value Types.
Integer/integral types have signed and unsigned variants. For example, `int` (`System.Int32`) is signed, which means it can represent negative and positive values within the range −2³¹ ⋯ 2³¹−1. On the other hand, `uint` (`System.UInt32`) is also a 4-byte integer type, but represents only non-negative values in the range 0 ⋯ 2³²−1.
TIP — Default Numeric Types
The default integer type in C# is `int`. The default floating point type is `double`. These should be the two types you use most of the time. The others are all special types, which are used in special situations. As an additional point: you should never mix signed and unsigned integer types in expressions.
Behaviour
Some characteristics determine behaviour, but we use it here to explain a concept: all types will expose a selection of constructors, methods, operators, properties and so forth. These are collectively called members, and if accessible, allow code to interact with an object. Loosely speaking, they provide an interface to the class, but we will never again use the term ‘interface’ in this context, simply to avoid confusion with the C# `interface` keyword, and its formal abstraction in the .NET Framework.
Some members are defined with `static`, and must be accessed via the class name in all code outside the class. Other members are defined with `const`, and they too must be accessed via the class name.
Operations
The low-level mechanics of arithmetic on different types of numbers (integers, unsigned values, floating point values, currency values) are all different. However, the programmer only has to use the arithmetic operators in a ‘natural way’: `a + b * c`, without regard to the low-level code, as long as the types of `a`, `b` and `c` are valid with respect to the operators.
The compiler will therefore, guided by the types of the operator operands, generate the machine code appropriate for the type. When you mix numeric types in arithmetic, the compiler will convert the operands with the smaller ranges to the type of the operand with the largest range.
Most operations on values are performed with operators. This is crucial, and unlike some non-C-like languages. Even assignment is an operator, and not a special kind of ‘assignment statement’. Every operator has rules regarding what the types of its operands may be. Some operators only work with integral values, for example.
Depending on the operator, the result of the operation may not be the same type as any of its operands. For example, all the relational (comparison) operators produce values of type `bool` (`System.Boolean`), regardless of the types of the expressions you are comparing.
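A short sketch makes the result types visible. `StaticType` is a hypothetical helper written for this illustration: because it is generic, it reports the compile-time type the compiler assigned to each expression.

```csharp
using System;
using static System.Console;

public static class OperandTypeDemo {
    // Hypothetical helper: returns the compile-time type of its argument.
    public static Type StaticType<T>(T value) { return typeof(T); }

    public static void Main() {
        int i = 7;
        double d = 2.0;
        WriteLine(StaticType(i / 2));  //⇒ System.Int32 (integer division)
        WriteLine(StaticType(i / d));  //⇒ System.Double (`i` converted first)
        WriteLine(StaticType(i < d));  //⇒ System.Boolean (comparison result)
        WriteLine(i / 2);              //⇒ 3
    }
}
```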
The `new`, `is`, `as` and cast operators all work with a type name as at least one operand. But other than that, types are not part of expressions. They are, however, involved implicitly at all times, since every value has a type. Every variable has a type. Every result has a type. Every expression, thus, has a type.
Existing types are mostly used for declarations. More specifically, they are used as part of the syntax pattern for defining variables, constants, and methods. These all have names, which are formally called identifiers. And that is how you will ‘work’ with types for the most part, until you know enough C# to create your own types, known as ‘user-defined types’.
Type Inference
The `var` keyword can be used inside functions/methods as an alternative to an explicit type. Such variables must be explicitly initialised with an expression. The `var` keyword is effectively replaced with the type of the expression at compile time.
This effect is called type inference: the type of the variable is inferred from the type of the initialising expression.
This can also be used when defining variables in the parentheses of `for` and `foreach` statements.
It is also often used for variables storing the results of LINQ expressions.
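The uses of `var` described above can be sketched as follows. `VarDemo` and `SumOfEvens` are hypothetical names; the comments give the type the compiler infers in each case.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using static System.Console;

public static class VarDemo {
    public static int SumOfEvens(IEnumerable<int> values) {
        var evens = values.Where(v => v % 2 == 0);  // IEnumerable<int>
        var total = 0;                              // int
        foreach (var v in evens)                    // int
            total += v;
        return total;
    }

    public static void Main() {
        var numbers = new List<int> { 1, 2, 3, 4 }; // List<int>
        for (var i = 0; i < numbers.Count; ++i)     // int
            Write("{0} ", numbers[i]);
        WriteLine();
        WriteLine(SumOfEvens(numbers));             //⇒ 6
    }
}
```

A declaration such as `var x;` does not compile, since there is no initialising expression to infer a type from.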
Major Type Categories
As mentioned above, in the .NET Framework and thus in C#, types are divided firstly into two categories, which determine a type's characteristics:
Reference Types / Class Types
This is the most common category, with the most sub-types. These are more commonly referred to as ‘class types’, since they are created with the `class` keyword in C#. Reference types always have `System.Object` as their ultimate base class (but not necessarily their direct ancestor).
Values of reference types are always only accessed via references, which is an abstraction for ‘addresses’ or ‘pointers’. This is implicit and transparent. Variables and parameters of a reference type thus only ever contain a reference, which points to, or refers to, the actual memory representing the object. A copy of a reference does not copy the object. So there will be space for the variable (space enough to store a reference) and space for the object.
Value Types / Struct Types
This is a small category without many types, but they are crucial from a performance perspective, which is exactly why they exist. All numeric types, for example, are value types, created with `struct` in C#, and have `System.ValueType` as their immediate base class.
A variable of a value type stores the actual value. There is no further memory in play. Assigning, or passing as argument such a value, makes copies of the actual value.
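The difference in copy behaviour between the two categories can be sketched like this. `RefBox`, `ValBox` and `CopyDemo` are hypothetical types invented for the comparison.

```csharp
using System;
using static System.Console;

public class RefBox { public int Value; }    // reference type
public struct ValBox { public int Value; }   // value type

public static class CopyDemo {
    public static int CopyReference() {
        var a = new RefBox { Value = 1 };
        var b = a;         // copies the reference: one object, two names.
        b.Value = 2;
        return a.Value;    // the change is visible through `a`.
    }

    public static int CopyValue() {
        var a = new ValBox { Value = 1 };
        var b = a;         // copies the value itself.
        b.Value = 2;
        return a.Value;    // `a` is unaffected.
    }

    public static void Main() {
        WriteLine(CopyReference());  //⇒ 2
        WriteLine(CopyValue());      //⇒ 1
    }
}
```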
Specialised Type Categories
Although fundamentally we have `class` types (reference types) and `struct` types (value types), there are a number of specialised types. Some of these are reference types, and some of them are value types. They are all created with different keywords though, which characterise their specialities.
Fundamental Types
Boolean values, integers, floating point values, currency (`decimal`) values, characters and strings are all rather ubiquitous and can be considered ‘fundamental’; this is not a formal specification. The numeric types are all value types. So is `char` (`System.Char`), while `string` and `object` are reference types.
Enumerated Types
Enumerated types are created with the `enum` keyword. This is used to create abstract constant values of the same type, which in memory representation, share the same space as integer types; specifically, and by default, `int`. They are convertible to/from integral types with an explicit cast.
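A minimal sketch of an enumerated type and its conversions follows. `Colour`, `Flag` and `EnumDemo` are hypothetical names; the second `enum` shows that an underlying type other than `int` can be requested.

```csharp
using System;
using static System.Console;

public enum Colour { Red, Green, Blue }       // underlying type: `int`
public enum Flag : byte { Off = 0, On = 1 }   // underlying type: `byte`

public static class EnumDemo {
    public static int ToInt(Colour c) { return (int)c; }       // explicit cast
    public static Colour FromInt(int n) { return (Colour)n; }  // explicit cast

    public static void Main() {
        WriteLine(ToInt(Colour.Blue));  //⇒ 2
        WriteLine(FromInt(1));          //⇒ Green
        WriteLine((byte)Flag.On);       //⇒ 1
    }
}
```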
Interface Types
Interface types are reference types created with the `interface` keyword. An interface represents a list of methods (including properties and indexers) that a class may choose to ‘implement’. A class can be ‘cast to an interface’, which means that it can be used in a context where any interface it implements is expected. The cast or type conversion is implicit. By convention, interface type names in the .NET Framework start with 2 capital letters, where the first is always `I`.
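As a brief illustration, the sketch below defines an interface, a class that implements it, and a function that expects the interface. `IShape`, `Square` and `InterfaceDemo` are hypothetical names.

```csharp
using System;
using static System.Console;

public interface IShape {
    double Area();
}

public class Square : IShape {
    private readonly double side_;
    public Square(double side) { side_ = side; }
    public double Area() { return side_ * side_; }
}

public static class InterfaceDemo {
    // Accepts any object whose class implements `IShape`.
    public static double AreaOf(IShape shape) { return shape.Area(); }

    public static void Main() {
        var sq = new Square(3.0);
        WriteLine(AreaOf(sq));   //⇒ 9 (`Square` implicitly converted)
    }
}
```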
Delegate Types
To abstract functions as references that can be stored and moved around, the `delegate` keyword can be used to create delegate types. Delegates, together with lambdas (anonymous function expression syntax), allow for some very powerful programming techniques. If you have a parameter of a delegate type, for example, you can pass it different delegates at different times to alter some logic of the function. This is called the ‘plugin’ or ‘call-back’ technique. There are many other uses, but passing delegates is common. Function names are expressions, and implicitly convertible to a matching delegate type.
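The call-back technique might look like this in its simplest form. `Transform`, `Twice` and `Apply` are hypothetical names chosen for the sketch.

```csharp
using System;
using static System.Console;

// A delegate type: any function taking an `int` and returning an `int`.
public delegate int Transform(int value);

public static class DelegateDemo {
    public static int Twice(int value) { return value * 2; }

    // The logic applied to `value` is supplied by the caller.
    public static int Apply(Transform f, int value) { return f(value); }

    public static void Main() {
        WriteLine(Apply(Twice, 21));       //⇒ 42 (function name converted)
        WriteLine(Apply(v => v + 1, 41));  //⇒ 42 (lambda converted)
    }
}
```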
Generic Types
Generic types are similar to template types as found in C++, and serve the same purpose. They are more complicated in implementation, but easier to use in C# compared to C++. They act like algorithms that can be physically copied and adapted for different types. The types for which they must be specialised are passed as ‘type arguments’ (the counterpart of C++ ‘template arguments’). These generic type names will always end with angle brackets enclosing one or more types.
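A small generic type could be sketched as follows. `Pair<T>` and `GenericDemo` are hypothetical; the same definition is reused for `int` and `string` simply by changing the type inside the angle brackets.

```csharp
using System;
using static System.Console;

public class Pair<T> {
    public T First;
    public T Second;
    public Pair(T first, T second) { First = first; Second = second; }
    public Pair<T> Swapped() { return new Pair<T>(Second, First); }
}

public static class GenericDemo {
    public static void Main() {
        var numbers = new Pair<int>(1, 2);
        var words = new Pair<string>("left", "right");
        WriteLine(numbers.Swapped().First);  //⇒ 2
        WriteLine(words.Swapped().First);    //⇒ right
    }
}
```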
Using Types in Code
In C# code, type names appear in several places, but are most common in definitions (declarative statements). But there are some operators which require at least one type as operand. Types are not values. They are more analogous to plans or blueprints — a specification of how to construct something, and what features it will have.
In this simple example program we use a number of C# type aliases to define, convert, pass and return values of these various types.
TypeUse.cs
— Type Locations
/*!@file TypeUse.cs
* @brief Examples of Locations where Types are Used.
*/
using System;
using System.Linq;
using System.Collections.Generic;
using static System.Console;
public class AppTypeUse {
    public static void Main() {        //←`Main` returns `void` type.
        int i = 123;                   //←`i` has type `int`.
        double d = 123.456;            //←`d` has type `double`.
        i = (int)d;                    //←`d`'s value ‘cast to’ `int`.
        i = F(123L);                   //←store `int` returned by `F()`.
    }
    static int F (long parm) {         //←`F` returns `int` & takes one
                                       //  `long` as parameter. It is
                                       //  `static` so `Main` can call it.
        return (int)(parm * 0.125);    //←`double` result cast to `int`,
    }                                  //  before returning the result.
}//class
As you can see, almost every non-structural line contains a type — either explicitly named, or implied by the value of an expression. Types are absolutely pervasive and at the core of understanding languages like C#. An expression is never ‘just a value’; instead, it is always ‘a value with a type’.
Expressions
Just to remind you: an expression is any arbitrary value, literal, constant or a combination of them interspersed with operators. If operators are present, they are evaluated in precedence order, taking their associativity with respect to their operands into consideration. Regardless of complexity, once evaluated, only one value remains: the result of the expression. And that resulting value has a type. We simply shorten this story to: ‘every expression has a type’.
Object Allocation
The `new` operator requires a type. It allocates space, and calls a constructor for that type to initialise the newly allocated memory.
Syntax:
- count: A run-time integer expression; it does not have to be a literal or constant. It is used here as placeholder for the meaning: ‘number of elements’.
- `new type(args)`: Allocates a new object of type. The args may be empty, depending on available constructor overloads. The type may be a value type (only initialises, no memory allocation takes place). The parentheses may be omitted only when an object initialiser ‘`{ ⋯ }`’ follows the type, in which case the parameterless constructor is called.
- `new type[count] [{init-list}]`: Allocates a new array. If the optional `{init-list}` is present, count may be omitted.
- `new type[count][] [{init-list}]`: Allocates a new array-of-arrays. If the optional `{init-list}` is present, count may be omitted.
int i = new int();              //←`new` not necessary for value types.
int j = 0;                      //←same effect as above.
string s = "ABC";               //←`new` is implicit.
int[] a = {11, 22, 33};         //←`new` is implicit (shorthand).
a = new int[]{44, 55, 66};      //←no shorthand possible here.
s = new string('-', 10);        //←explicit `new`.
The `new` operator must be called for all reference types, whether it is explicit, via a shorthand syntax, or via some function that ‘creates’ objects (sometimes called an ‘object factory’).
Local Variable Definitions
To define a variable inside a function, you have to choose (a) a name (identifier) that is not a keyword, and (b) most importantly, a type for the variable. Unlike C/C++, C# does not allow the `static` modifier on local variables: every local variable loses its value when you return from the function. A value that must survive between calls belongs in a field.
Syntax — Basic Variable Definitions
- `type ident;`: Define an uninitialised local variable called ident. It must be assigned a value later, before use.
- `type ident = expr;`: Define a variable ident, and initialise it with an expression, which must have the same type, or be implicitly convertible to type.
- `type ident1 = expr1, ident2 = expr2, …;`: Define multiple variables of the same type. Any may optionally be initialised.
Examples of local variable definitions and scope
public static void Main() {
    int i = 123;
    /* compound statement block: */
    {
        int j = 456, k, l;
        WriteLine("i={0}, j={1}", i, j);
        k = 77; l = 88;
        WriteLine("k={0}, l={1}", k, l);
    }
}
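A local variable has no default value: the compiler rejects any attempt to read it before it has been assigned. A field, by contrast, is given the default value for its type if it is not explicitly initialised; for numeric types, that is zero.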
In C#, variables defined at a lower (nested) level are not allowed to hide variables with the same names defined in a higher local scope. This differs from the C/C++ rules in this regard.
Fields/Data Member Definitions
The syntax for data members (fields), including `readonly` or `const` members, requires a type and looks similar to local variable definitions. Because members appear at class level, their definitions may be prefixed with an access specifier. As a good coding convention, you should generally use `private` for fields (i.e. only code in the class has access permission).
Syntax — Fields
- `access [static] [readonly] type ident;`: Data members that are not `readonly` or `const` are generally given `private` access. A `static` field is shared by all objects (only one copy ever exists), and is not that common. A `readonly` field can only be written to (initialised) by a field initialiser or a constructor.
- `access [static] [readonly] type ident = expr;`: Same rules as above apply, except for the ‘`= expr`’ part, which is called a ‘field initialiser’.
- `access const type ident = expr;`: Define a symbolic constant. The ‘`= expr`’ part is not optional, and must be a constant expression. It is accessed like a `static` field: ‘`class.ident`’.
The evaluation of a field initialiser expression is triggered by a constructor call (with `new`), and is executed before the body of the relevant constructor.
Class data members/field definition examples
class Foo {
    public const int CF = 12;        //← ‘symbolic constant’.
    public static int SF = 34;       //← ‘shared’ field; initialised.
    public readonly int RF = 56;     //← read-only instance field.
    private int IFI = 78;            //← ‘field initialisation’.
    private int IFU;                 //← most common field syntax.
    public Foo () {                  //← constructor to initialise;
                                     //  `public` so others can call it.
        RF = 55;                     //← only constructors can write
                                     //  to `readonly` fields.
        IFU = 90;                    //← initialise instance field.
    }
}//class
// in some function:
⋯ var obj = new Foo(); //← allocate and initialise.
Write("{0}\n", Foo.CF); //← access symbolic `const`ant.
Write("{0}\n", Foo.SF); //← access `static` field.
Write("{0}\n", obj.RF); //← access `readonly` field.
Write("{0}\n", obj.IFI); //← ERROR. no access. ☆
Write("{0}\n", obj.IFU); //← ERROR. no access. ☆
☆ If the last two statements appeared in a function inside the class, they would not have been in error. On the other hand, if the members had `public` access, the code, as is, would compile.
Properties and Indexers
Properties appear to users of the class as if they are variables, which may or may not be writeable. Indexers are specialised properties with no name, but which allow programmers to use subscripting on objects. Properties may be `static` (shared by all objects); indexers may not. In the syntax below, prop is really just an identifier.
Syntax — Properties & Indexers
- `access [static] type prop { get { ⋯ return expr; } set { ⋯ } }`: This is the most common type of property. The `get` part retrieves the value; the `set` part is optional, and gets `value` as an automatic parameter. The property acts like a member variable: ‘`obj.prop = expr;`’, as example. Or if `static`: ‘`class.prop = expr;`’.
- `access [static] type prop { get; set; }`: This is an automatic property: the compiler automatically creates a backing variable of type, and automatically creates code for `get` to retrieve this variable and for `set` (optional) to set it.
- `access type this[params] { get { ⋯ return expr; } set { ⋯ } }`: This is syntax for an indexer, and allows you to subscript objects of this class. The `get` part gets params as argument(s); the `set` part gets ‘type `value`’ and params as parameters. A param can be any type (normally some key or index). You may have more than one parameter, separated by commas.
By default, the `get` and `set` parts have the same access as the property itself, but one of them can be given a more restrictive access explicitly. It is common to give the `set` part `private` permissions when the access of the property itself is `public` (which means `get` remains `public`).
The contextual keyword `value` is a parameter with the type of the property, which is automatically created and passed to the `set` part.
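The three forms can be seen together in one small class. `Scores` and `PropertyDemo` are hypothetical names; the class has an automatic property with a `private` setter, a read-only computed property, and an indexer.

```csharp
using System;
using static System.Console;

public class Scores {
    private readonly int[] data_ = new int[3];

    public string Name { get; private set; }           // automatic property
    public int Count { get { return data_.Length; } }  // read-only property

    public int this[int index] {                       // indexer
        get { return data_[index]; }
        set { data_[index] = value; }  // `value` has the indexer's type
    }

    public Scores(string name) { Name = name; }
}

public static class PropertyDemo {
    public static void Main() {
        var s = new Scores("quiz");
        s[1] = 85;                       // calls the indexer's `set`
        WriteLine("{0}: {1} of {2}", s.Name, s[1], s.Count);
        //⇒ quiz: 85 of 3
    }
}
```

Writing `s.Name = "exam";` outside the class does not compile, because the `set` part is `private`.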
Function Parameters
When we design a function to accept expressions as arguments, which we ‘pass’ to the function with the function call operator, we must define the function's parameters. They are, in every respect, variables local to that function, with the same lifetime and the same scope. The only difference is the context and use — parameters are initialised by the caller.
In the example syntax below, the focus is on the parameter syntax (shown in bold). Parameters cannot be defined in isolation, however, and are always part of the syntax for a function definition:
access modifiers ret-type func-name (parameters) { ⋯ }
- `param`: One parameter in the form: ‘type ident’.
// Example function taking one parameter of type `int`. After
// a call, `parm` is for all intents and purposes, an
// initialised *local variable*.
//
public static void Func (int parm) {
    // use `parm`.
    ⋯parm⋯
}
// Just the simplest function, always returning `42` (`int`).
//
public static int Answer () { return 42; }
// Somewhere; calling `Foo.Func()` with several arguments, but
// regardless of the expression passed, it must have type `int`.
//
int i = 123;
Foo.Func(i); Foo.Func(123); Foo.Func(Foo.Answer());
- `param = const-expr`: An optional parameter, where const-expr is the default constant expression the parameter will be initialised with if the caller does not pass an argument for this parameter.
public static void Func (int parm = 123) { ⋯ }
⋯
int i = 123;
// all the calls below have exactly the same effect:
Func(); Func(i); Func(123);
- `param1, param2, …, paramn`: List of params, separated by commas. Parameters with default values must all be on the right, after any parameters without default values.
public static void Func (int parm1, double parm2 = 1.23) { ⋯ }
⋯
int i = 123; double d = 1.23;
// All the calls below have exactly the same effect:
Func(123); Func(i); Func(i, d); Func(i, 1.23);
// For interest's sake, it is possible to name arguments, in
// which case the order is less important:
Func(parm1:123); Func(parm2:1.23, parm1:123);
- `params type[] ident`: Special ‘`params`’ parameter, which must be an array of type, but can be passed either an array, or a list of expressions as arguments.
public static void Func (params int[] parm) {
    foreach (int i in parm) Write("{0} ", i);
    WriteLine();
}
// all legal calls to `Func()`:
⋯
Func(11, 22, 33, 44); Func(55, 66);
Func(new int[] { 77, 88, 99 });
int[] arr = { 11, 22, 33, 44, 55 }; Func(arr);
- `ref param`: An ‘in/out’ pass-by-reference parameter. This parameter will accept only initialised variables when the function is called.
public static void Func (ref int parm) {
    parm = parm * 2;            //←modifies *caller*'s variable.
                                //or: `parm *= 2;`
}
⋯
int i = 11;                     //←must be initialised to pass.
Write("i = {0}", i);            //⇒ `i = 11`
Func(ref i);                    //←pass reference to `i`.
Write("i = {0}", i);            //⇒ `i = 22`
- `out param`: An ‘out-only’ pass-by-reference parameter. It will accept an uninitialised variable passed as argument. The function cannot read from an `out` parameter before assigning to it, and it must write something to it before returning.
public static void Func (out int parm) {
    parm = 123;                 //←modifies *caller*'s variable.
}
⋯
int i;                          //←need not be initialised.
Func(out i);                    //←pass reference to `i`.
Write("i = {0}", i);            //⇒ `i = 123`
A pass-by-reference parameter allows a function to modify the content of a variable passed. Normally, this is not possible, since the default behaviour is pass-by-value, which means a copy of the variable content is passed. The `ref` or `out` keyword must appear before the argument when the function is called.
Function Returns
Functions must have a return type as part of the syntax, except for a handful of specialised functions (constructors, destructors and overloaded cast operators). A special abstract type called `void` is available if a function has nothing useful to return. In the example syntax below, func is an identifier that names the function.
- modifiers: Functions may have `virtual` or `override` modifiers, but this does not influence the ret-type (return type), which is our focus here.
- ret-type: Any type, or the special abstract type `void`. This is available to simulate procedures in other languages. In other words, it is for functions that must perform some task(s), but have nothing useful to return.
access modifiers [static] ret-type func(param(s)) {
    [local-var(s)]
    [statement(s)]
    return expr;
}
Only functions returning `void` are not required to have a ‘`return expr;`’ statement, but they may optionally have ‘`return;`’ statements. Although not encouraged, functions may have multiple `return` statements.
Type Conversion / Type Cast
A common synonym for type conversion is ‘cast’ or ‘type cast’. That explains the name of the cast operator; it is a unary prefix operator, and one of the few that requires a type operand.
Implicit & Explicit Conversions
Type conversions can be implicit (automatic), or explicit (use of cast operator). Implicit casts are only performed when it is safe, like converting a smaller numerical type to a larger one, e.g.: converting an `int` to a `long`, or a `byte` to a `double`. This is sometimes called a promotion (going from a smaller to a larger type).
The reverse (called demotion) is allowed on all numeric types, but only with an explicit cast, and the result may not be the original value (as it may not fit in the smaller destination). This effect is called ‘truncation’ (which means ‟to shorten by cutting off”), and does not involve much intelligence. Converting a floating point value explicitly to an integral type will simply truncate the decimal part (no rounding takes place, in other words). Use `Math.Ceiling`, `Math.Floor`, or `Math.Round` for alternative behaviours.
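The sketch below contrasts a truncating cast with the `Math` alternatives. `TruncationDemo`, `Truncate` and `LowByte` are hypothetical names; `unchecked` is written out only to make explicit that overflow is not being checked.

```csharp
using System;
using static System.Console;

public static class TruncationDemo {
    public static int Truncate(double d) { return (int)d; }

    // Keeps only the lowest 8 bits of `n`.
    public static byte LowByte(int n) { return unchecked((byte)n); }

    public static void Main() {
        WriteLine(Truncate(2.9));       //⇒ 2 (fraction cut off)
        WriteLine(Truncate(-2.9));      //⇒ -2 (towards zero, not down)
        WriteLine(Math.Floor(-2.9));    //⇒ -3
        WriteLine(Math.Ceiling(2.1));   //⇒ 3
        WriteLine(Math.Round(2.5));     //⇒ 2 (midpoints go to the even value)
        WriteLine(Math.Round(2.5, MidpointRounding.AwayFromZero));  //⇒ 3
        WriteLine(LowByte(300));        //⇒ 44 (300 does not fit in a `byte`)
    }
}
```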
Up Casts & Down Casts
For reference types, converting to a base type from a derived type is called an ‘up cast’ or ‘widening conversion’, and is implicit. For value types, this is true only when casting to `object`, in which case it is called ‘boxing’ and discussed below.
Converting a base class reference down the inheritance hierarchy (towards derived classes) is sometimes called a ‘down cast’, or a ‘narrowing conversion’. This is not guaranteed to succeed, and must hence be performed explicitly. If it fails, the cast operator will throw an exception. To avoid an exception, you can use the `as` operator: it will simply return `null` on failure instead. The `as` operator can thus be used to check if a cast will succeed, before attempting a down cast.
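A minimal sketch of both directions follows, using a hypothetical three-class hierarchy (`Animal`, `Dog`, `Cat`) invented for the illustration.

```csharp
using System;
using static System.Console;

public class Animal { }
public class Dog : Animal { public string Speak() { return "Woof"; } }
public class Cat : Animal { }

public static class CastDemo {
    public static string TrySpeak(Animal animal) {
        Dog dog = animal as Dog;   // `null` if `animal` is not a `Dog`.
        return dog != null ? dog.Speak() : "(not a dog)";
    }

    public static void Main() {
        Animal a = new Dog();      // up cast: implicit.
        Dog d = (Dog)a;            // down cast: explicit; succeeds here.
        WriteLine(d.Speak());              //⇒ Woof
        WriteLine(TrySpeak(new Cat()));    //⇒ (not a dog)
        try {
            Animal c = new Cat();
            Dog bad = (Dog)c;      // down cast fails at run time.
        } catch (InvalidCastException) {
            WriteLine("cast failed");      //⇒ cast failed
        }
    }
}
```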
Custom Conversion Operators
Classes may overload operators, including cast operators. They have a slightly different syntax compared to other overloadable operators. The `implicit` and `explicit` keywords control whether the conversion can be automatically applied by the compiler, or whether the conversion can only be performed explicitly by employing the cast operator.
Custom conversion operators example
class CTYPE {
    private double data_;
    public CTYPE (double data = 0.0) { data_ = data; }
    public static implicit operator double (CTYPE parm) {
        return parm.data_;
    }
    public static explicit operator CTYPE (double parm) {
        return new CTYPE(parm);     //←wrap the `double` in a new object.
    }
}
⋯
CTYPE obj = new CTYPE(12.34);   //←nothing exciting.
double d = obj;                 //←`CTYPE` implicitly cast.
obj = (CTYPE)d;                 //←`double` explicitly cast.
The example is trivial, but does illustrate the syntax. This should not be used too often, but programmers should be aware of the possibilities. If in doubt, rather use explicit
.
Predefined Types
Unfortunately, swathes of documentation use either the term ‘intrinsic types’, or ‘built-in types’. Both terms are misleading. C# has no built-in types. All types are provided by the .NET Framework, where fundamental types and their behaviour are from the CTS (Common Type System). Only in un-managed (native code) sections, or if the whole C# program has been compiled with the ‘/unsafe
’ option, do these terms have a semblance of credence, because then C# has to interface with native C/C++ built-in types.
Even ‘predefined’ may be misconstrued — it might be better to consider the names that follow as aliases, or synonyms, for existing .NET types. Ultimately, they are just optional and succinct alternatives, with no additional meaning or behaviour with respect to the .NET types for which they are shorthand.
Miscellaneous Types
The following aliases have little in common with other types, which is why they are grouped here.
C# Name | .NET Type | Description / Range |
---|---|---|
object | System.Object | All types inherit from it at some point. |
bool | System.Boolean | true or false. |
Although not types in the same sense we describe the others, you should be aware of two type patterns in addition to the above:
type
[]
— A type that is shorthand for an array; specifically System.Array
. The verbosity it eliminates is significant. The whole pattern itself is a type, so ‘type[][]’ is possible (an array-of-arrays); which is a type, so the process can be continued.
public static int[] FRA() { //←function returns array.
    int[] arr = new int[4] { //←create array of `int`s &
        11, 22, 33, 44 // initialise the array.
    };
    return arr;
}
⋯
int[] result = FRA(); //←store returned array.
val-type
?
— In this case, val-type must be a value type, and the pattern is shorthand for: System.Nullable<
val-type>
. The angle brackets (<>
) are part of the type name, and val-type is said to be a generic type argument. This wraps a value type, so that it becomes possible to assign null
to a variable of a nullable type. It can be converted back to the original type with the cast operator, which throws an InvalidOperationException if the variable holds null.
int? ni; //←like `int`, but nullable.
Nullable<int> nj; //←long pattern for above.
ni = null; nj = null; //←both can store `null`.
ni = 12; nj = 34; //←both can store `int` values.
int i, j; //←define two ‘normal’ `int`s.
i = (int)ni; j = (int)nj; //←convert nullables to `int`s (throws if `null`).
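Before converting a nullable back to its underlying type, it is usually worth checking whether it holds a value at all. The following is a minimal sketch of the common options; the `NullableDemo` class and `OrZero` helper are illustrative names only.

```csharp
using System;

static class NullableDemo {
    // The null-coalescing operator supplies a fallback for `null`.
    public static int OrZero(int? n) => n ?? 0;

    static void Main() {
        int? ni = null;
        Console.WriteLine(ni.HasValue);               // False
        Console.WriteLine(OrZero(ni));                // 0
        Console.WriteLine(ni.GetValueOrDefault(-1));  // -1

        ni = 42;
        Console.WriteLine(ni.HasValue);               // True
        Console.WriteLine(ni.Value);                  // 42
        int i = (int)ni;                              // safe: `ni` holds a value
        Console.WriteLine(i);                         // 42
    }
}
```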
bool
Values of type bool
can only store true
or false
. They are C# built-in symbolic constants, although some documentation will group them with literals. Unlike in C/C++, they cannot be cast to numeric types (nor can numeric values be cast to bool); use Convert.ToInt32
instead, in which case false
will result in 0
, and true
in 1
.
The logical &&
(and), ||
(or) and !
(not) operators return true
or false
, and expect bool
operands.
The comparison operators, on the other hand, accept any supported types for their operands, but always return bool
, which is only reasonable. C#'s iteration and selection statements (barring switch
and foreach
), expect a bool
result for the conditional expression.
Expressions resulting in bool
, or requiring bool
int i = 0;
if (i == 0) { //←must be `true`.
    while (i < 10) { //←loop while `true`.
        Write("{0} ", i); //←do some ‘work’.
        ++i; //←change condition.
    }
    WriteLine();
}
for (int j = 0; j < 10; ++j) //←more succinct.
    Write("{0} ", j); //←do some ‘work’.
WriteLine();
bool b = i == 10 && true; //←save result of `i == 10 && true`.
WriteLine("b = {0}, {1}", b, Convert.ToInt32(b)); //←output `bool` and `int` values.
for (i = 0; i < 10; ++i) //←only print odd numbers.
    if (IsOdd(i)) //←`IsOdd()` returns `bool`.
        Write("{0} ", i);
WriteLine();
⋯
public static bool IsOdd (int x) {
    return x % 2 != 0; //←return `bool` result (also correct for negative `x`).
}
object
All members of object
are inherited by all other types. This explains, for example, why all expressions of any type have a ToString
method. Most types override
it, since it is virtual
. The object type is itself a reference type, and from the rules of object-oriented programming, you can assign a reference of any type to a variable, parameter, or return value of type object
.
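To make the inheritance from `object` concrete, here is a small sketch. The `Point` and `Plain` classes and the `Describe` helper are hypothetical names used only for illustration.

```csharp
using System;

class Point {
    public int X, Y;
    // Replaces the default text inherited from `object`.
    public override string ToString() => $"({X}, {Y})";
}

class Plain { }   // inherits `ToString` unchanged from `object`

static class ObjectDemo {
    // Any reference (or boxed value) can be passed where `object` is expected.
    public static string Describe(object o) => o.ToString();

    static void Main() {
        Console.WriteLine(Describe(new Point { X = 1, Y = 2 }));  // (1, 2)
        Console.WriteLine(Describe(new Plain()));                 // the type's full name
        Console.WriteLine(Describe(42));                          // 42
    }
}
```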
Boxing & Unboxing
Syntactically, and to conform to OOP ideals, you can copy a value type (val-type) in a location expecting an object
. This means the compiler must create a reference, since value type objects do not contain references. This process is called boxing.
To retrieve the boxed value type, it must be unboxed. Syntax-wise, this means you must use the cast operator on the object
expression or variable.
Boxing and unboxing examples
object o;
int i = 123; //← arbitrary value type (`int`).☆
o = i; //← legal, but special = ‘boxing’.
i = (int)o; //← cast to `int` = ‘unboxing’.
object[] oa = new object[3]; //← an array of `object`s.
oa[0] = 123; //← boxing. store ref.
oa[1] = "ABC"; //← store `string` ref.
oa[2] = null; //← `null` can go in any ref.
WriteLine("oa[0] = {0}", oa[0]); //← unboxing not necessary.
WriteLine("oa[1] = {0}", oa[1]); //← no unboxing.
☆ Obviously, instead of int
, any value type could have been used.
Character and String Types
Characters in the .NET Framework (and the Windows operating system) are 2 bytes in size, and use UTF-16 encoding. A char converts implicitly to ushort, int and the larger numeric types, but converting a numeric value to a char requires an explicit cast. The elements in a string are of type char
.
C# Name | .NET Type | Description / Range |
---|---|---|
char | System.Char | Any character. |
string | System.String | Reference type. Immutable. |
Strings are immutable. Any string transformation consequently involves a copy of the original with the modifications applied. This makes them safe to pass to functions, even though you are passing a reference to the actual string data.
TIP — Efficient String Concatenation
If you encounter a situation where long strings are created regularly by means of appending, consider using the System.Text.StringBuilder
class. It is very efficient at concatenating strings, and when done, you can simply use ToString
to retrieve a ‘proper’ string for further use.
Building strings in lieu of concatenation
string s1 = ""; //←initialise with ‘empty string’.
string s2 = new string('-', 10); //←initialise with 10 dashes.
s1 = s2; //←both now reference 10 dashes.
s2 = s2.ToUpper() + "DEF"; //←new upper case & concat.
char c = s2[1]; //←subscript returns `char`.
s2 += "XYZ"; //←append (inefficiently).
var sb //←workhorse to the rescue.
    = new System.Text.StringBuilder();
sb.Append("ABC"); //←append (efficiently).
sb.AppendFormat("-{0}-", 123); //←format & append.
s1 = sb.ToString(); //←newly concatenated str.
Strings and characters are really easy to work with. The only caveat ever, is when concatenating long strings many times.
Character Literals
The sequence: '
char'
(a character between single quotes), has type char
(System.Char
), and is a ‘character literal’. The char can be a character on your keyboard, any other Unicode character that fits in a single UTF-16 code unit, or an escape sequence.
String Literals
The sequence: "
‹chars›"
is a string literal and has type string
(System.String
). It is possible to have no chars: ""
, making it an empty string. Like character literals, a literal string can contain escape sequences. A string literal can be prefixed with the ‘at’ character (@
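), also called the ‘verbatim character’, in which case this string is called a ‘verbatim string’, where escape sequences have no meaning (the backslash is interpreted literally); the only exception is a double quote, which is written as two consecutive double quotes.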
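The contrast between regular and verbatim string literals may be easier to see side by side. This is a minimal sketch; the `VerbatimDemo` class and the file path are invented for the example.

```csharp
using System;

static class VerbatimDemo {
    public static readonly string Regular  = "C:\\temp\\new.txt";  // backslashes escaped
    public static readonly string Verbatim = @"C:\temp\new.txt";   // backslashes taken literally

    static void Main() {
        Console.WriteLine(Regular == Verbatim);   // True
        Console.WriteLine(@"say ""hi""");         // say "hi"
        Console.WriteLine(@"a\tb".Length);        // 4 (no tab: `\` and `t` are separate characters)
        Console.WriteLine("a\tb".Length);         // 3 (`\t` is one tab character)
    }
}
```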
@ Prefix — Further Uses
The @
prefix not only works for ‘verbatim strings’, but also in front of any identifier. This means that any keyword can be used as an identifier, as long as prefixed with the @
sign — not good practice, or even a good idea.
Since C# 6.0, the syntax supports string interpolation. This is triggered when a literal string is prefixed with the dollar sign ($
). If the string then contains expressions enclosed in curly braces, the expressions are evaluated and converted to their string representation. This string representation replaces the whole curly-brace placeholder sequence inside the literal string.
String interpolation examples
using static System.Console;
⋯
int I = 123;
string S = "ABC";
string X = $"I={I}, S=\"{S}\".";
WriteLine(X); //⇒ `I=123, S="ABC".`
WriteLine($"I={I}, S=\"{S}\"."); //⇒ `I=123, S="ABC".`
X = String.Format( //←longer alternative.
    "I={0}, S=\"{1}\".", I, S); //
WriteLine(X); //⇒ `I=123, S="ABC".`
WriteLine( //⇒ `I=123, S="ABC".`
    "I={0}, S=\"{1}\".", I, S);
As you can see from the above examples, string interpolation can greatly simplify string formatting. We recommend you use it, as long as you remember this is only legal from C# 6 (Visual Studio 2015, in other words).
String Interpolation
From C# 6, literal strings can optionally be prefixed with a dollar sign ($
). This enables string interpolation, which is a convenient alternative to String.Format
. You are encouraged to use this syntax, since it is more concise, more comprehensible, and less error-prone.
Like String.Format
, the syntax utilises matching curly braces to delimit a placeholder. But rather than specifying the index of the argument to format into the string, string interpolation allows any valid C# expression between the braces. Contrast the following equivalent initialisations:
string s = $"PI * 2 = {Math.PI * 2.0}!";
string t = String.Format("PI * 2 = {0}!", Math.PI * 2.0);
It is difficult to argue that the string interpolation version is not more readable, more concise and less error-prone.
The expression in the placeholder can be formatted with the same formatting that String.Format
uses:
string s = $"PI * 2 = {Math.PI * 2.0:F4}!";
string t = String.Format("PI * 2 = {0:F4}!", Math.PI * 2.0);
It is a win-win: Easy syntax, leveraging existing knowledge, providing not only a better experience, but also better programs.
Escape Sequences
An escape sequence can be used as a character in literal characters, or literal strings. An escape sequence starts with a backslash (\
), making the backslash an escape character in this context. To represent a backslash as an actual character, it must be prefixed with another backslash: '\\'
(literal character), or inside a string literal: "⋯\\⋯"
.
To represent a single quote as a literal character, it must be escaped: '\''
, and to represent a double quote character inside a literal string, it must also be escaped: "⋯\"⋯"
.
The special sequence: \0
represents the null character. Other special sequences are: \n
(newline), \r
(carriage return), \a
(bell), \b
(backspace), \f
(form feed), \t
(tab) and \v
(vertical tab).
The numerical (hexadecimal) code for a character can be created with the sequence: \x
, followed by up to 4 hex digits (upper- or lower- case). We suggest you rather use the other option to represent Unicode values: \u
, followed by exactly four hex digits (pad short values with leading zeros: \u00A0
), or \U
, followed by exactly eight hex digits.
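The following sketch exercises a few of these escape sequences, and also hints at why the variable-length `\x` form is best avoided. The `EscapeDemo` class and its constants are illustrative names only.

```csharp
using System;

static class EscapeDemo {
    public const char Backslash = '\\';
    public const char Quote     = '\'';
    public const string Nbsp    = "\u00A0";   // non-breaking space

    static void Main() {
        Console.WriteLine("a\tb\nc");            // `a`, tab, `b`, newline, `c`
        Console.WriteLine("\"quoted\"");         // "quoted"
        Console.WriteLine((int)Backslash);       // 92
        Console.WriteLine((int)Quote);           // 39
        Console.WriteLine((int)Nbsp[0]);         // 160
        Console.WriteLine("\x41" == "\u0041");   // True (both are `A`)

        // `\x` consumes up to four hex digits, so the text that follows matters:
        Console.WriteLine("\x9Good".Length);     // 5 (a tab, then `Good`)
        Console.WriteLine("\x9Bad".Length);      // 1 (`9Bad` is read as ONE hex code)
    }
}
```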
Character Sizes and Encoding
The .NET Framework uses UTF-16 encoding for char
acters and string
s. This means that every char value occupies 2 bytes in memory; characters outside the Basic Multilingual Plane occupy two char values (a surrogate pair). This is rarely an issue, just something to be aware of. The .NET Framework has many options for converting to and from other encodings (from the System.Text
namespace).
We suggest for text storage and output, you convert to/from UTF-8. This is the default encoding for HTML and XML, for example. Writing to the console is automatically converted to your Console encoding, so you do not have to worry about that. The default text readers will also automatically convert from the input encoding to the internal UTF-16 encoding.
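A conversion to and from UTF-8 could look something like the sketch below, which uses `System.Text.Encoding`. The `EncodingDemo` class and its helper methods are illustrative names, and the sample text is arbitrary.

```csharp
using System;
using System.Text;

static class EncodingDemo {
    public static byte[] ToUtf8(string s) => Encoding.UTF8.GetBytes(s);
    public static string FromUtf8(byte[] bytes) => Encoding.UTF8.GetString(bytes);

    static void Main() {
        string s = "caf\u00E9";                               // 4 `char`s, the last is `é`
        byte[] utf8 = ToUtf8(s);
        Console.WriteLine(s.Length);                          // 4
        Console.WriteLine(utf8.Length);                       // 5 (`é` needs 2 bytes in UTF-8)
        Console.WriteLine(Encoding.Unicode.GetByteCount(s));  // 8 (UTF-16: 2 bytes per `char`)
        Console.WriteLine(FromUtf8(utf8) == s);               // True (round trip)

        string smiley = "\U0001F600";                         // outside the Basic Multilingual Plane
        Console.WriteLine(smiley.Length);                     // 2 (a surrogate pair)
    }
}
```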
Integral / Integer Types
These are also called ‘integer’ types, but many take this to mean the int
type, so we prefer the term ‘integral’ unless the context is clear. (Nevertheless, we will never use ‘integer’ as a synonym for int
in any material we provide.)
C# Name | .NET Type | Description / Range |
---|---|---|
sbyte | System.SByte | -2⁷⋯+2⁷-1. |
byte | System.Byte | 0⋯+2⁸-1. |
short | System.Int16 | -2¹⁵⋯+2¹⁵-1. |
ushort | System.UInt16 | 0⋯+2¹⁶-1. |
int | System.Int32 | -2³¹⋯+2³¹-1. Default integer type. |
uint | System.UInt32 | 0⋯+2³²-1. |
long | System.Int64 | -2⁶³⋯+2⁶³-1. |
ulong | System.UInt64 | 0⋯+2⁶⁴-1. |
All arithmetic operators are available for integral types. When you mix the types of the operands, the smaller type is converted to the larger type in the expression; operands smaller than int are always promoted to int first.
The bitwise operators: &
(and), |
(or), ^
(xor), ~
(not), <<
(left shift) and >>
(right shift) work with these integral types. The modulus or ‘remainder’ operator (%
) accepts integral operands and, unlike in C/C++, floating point and decimal operands as well.
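A few of these operators in action, as a minimal sketch. The `IntegralDemo` class and `Remainder` helper are illustrative names only.

```csharp
using System;

static class IntegralDemo {
    public static int Remainder(int x, int y) => x % y;

    static void Main() {
        byte a = 200, b = 100;
        var sum = a + b;                          // `byte` operands are promoted to `int`
        Console.WriteLine(sum);                   // 300
        Console.WriteLine(sum.GetType().Name);    // Int32
        var big = sum + 1L;                       // `int` operand converted to `long`
        Console.WriteLine(big.GetType().Name);    // Int64

        Console.WriteLine(0xC & 0xA);             // 8
        Console.WriteLine(0xC | 0xA);             // 14
        Console.WriteLine(0xC ^ 0xA);             // 6
        Console.WriteLine(~0);                    // -1
        Console.WriteLine(1 << 4);                // 16
        Console.WriteLine(-16 >> 2);              // -4 (the sign is preserved)

        Console.WriteLine(Remainder(7, 3));       // 1
        Console.WriteLine(Remainder(-7, 3));      // -1 (sign follows the left operand)
    }
}
```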
Integer Literals
Integer literals are constant values assumed by default to be in base 10 (decimal), and have type int
(System.Int32
). Suffixes can change the type, and it is possible to use notations for number bases other than 10: hexadecimal (base 16) and binary (base 2).
Integer Literal Suffixes
Although the literal: 123
has by default type int
, by adding the L
suffix: 123L
, it now has type long
(System.Int64
). You can use the cast operator as well: (long)123
, but that is an expression containing an operator and an int
literal. Also: 123U
has type uint
(System.UInt32
), and 123UL
, has type ulong
(System.UInt64
).
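One way to confirm the effect of each suffix is to ask the literal for its run-time type. This is only a sketch; `SuffixDemo` and `TypeOf` are made-up names.

```csharp
using System;

static class SuffixDemo {
    // Boxes the argument and reports the name of its .NET type.
    public static string TypeOf(object o) => o.GetType().Name;

    static void Main() {
        Console.WriteLine(TypeOf(123));        // Int32
        Console.WriteLine(TypeOf(123L));       // Int64
        Console.WriteLine(TypeOf(123U));       // UInt32
        Console.WriteLine(TypeOf(123UL));      // UInt64
        Console.WriteLine(TypeOf((long)123));  // Int64 (a cast expression, not a literal)
    }
}
```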
Integer Literal Base Notation
To change the notation to hexadecimal, prefix a sequence of hex digits in the range 0…9,A…F
with 0x
(the x
can be upper case): 0x123
is hexadecimal value 123
, or 291
in decimal. You can use lower case hex digits a…f
as well.
The base notation does not change the type, so the value 0x123
still has type int
, while 0x123UL
has type ulong
, as an example. Unlike C/C++/Java, C# has no octal notation: a leading 0
is simply ignored, so 0123
is the decimal value 123
.
C# 7 — Numeric Literal Enhancements
From C# 7, you can prefix 0b
for base 2 (binary) literals: 0b01111011
is 123
in decimal. Also from C# 7, underscores can group digits in numeric literals: 0b0111_1011
, or 1_234_567
are both legal literals (of type int
).
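The notations can be compared directly, since they only change how a value is written and not the value itself. The sketch below assumes a C# 7 (or later) compiler; `BaseNotationDemo` and its constants are illustrative names.

```csharp
using System;

static class BaseNotationDemo {
    public const int Hex     = 0x7B;          // hexadecimal
    public const int Binary  = 0b0111_1011;   // binary with a digit separator (C# 7)
    public const int Grouped = 1_234_567;     // decimal with digit separators (C# 7)

    static void Main() {
        Console.WriteLine(Hex);                        // 123
        Console.WriteLine(Binary);                     // 123
        Console.WriteLine(Hex == Binary);              // True
        Console.WriteLine(Grouped);                    // 1234567
        Console.WriteLine(0x123);                      // 291

        // Going the other way: format a value in another base at run time.
        Console.WriteLine(Convert.ToString(123, 2));   // 1111011
        Console.WriteLine(Convert.ToString(291, 16));  // 123
    }
}
```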
Real / Floating Point Types
C# Name | .NET Type | Description / Range |
---|---|---|
float | System.Single | 32 bits. |
double | System.Double | 64 bits. Default floating point type. |
decimal | System.Decimal | Currency. Less range, but higher precision. |
Floating Point Literals
A numeric literal which contains a decimal point is called a ‘floating point literal’. Instead of the term ‘floating point’, some documentation may use the term ‘real’. The default type of a floating point literal, e.g. 123.456
, is double
(System.Double
).
Floating Point Literal Suffixes
Floating point literals can be written in fixed-point notation, or exponential notation (sometimes called ‘scientific notation’): the value 123.456
is equivalent to 1.23456e2
, and both have type double
. You can legally also suffix a D
(for ‘double
’), but this is superfluous.
The F
suffix will cause the type to be float
(System.Single). You should use float
only when forced to. Please avoid it as much as you can.
The M
suffix (think: Money) will change the literal's type to decimal
(System.Decimal
): 12.3456M
. This is the appropriate type to use when you are working with currency values.
As noted before, in C# 7, you can use underscores to group digits. This also applies to floating point literals: 1_234.567_890
.
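The suffixes, and one practical reason for preferring `decimal` with currency, can be sketched as follows. `FloatLiteralDemo` and `TypeOf` are illustrative names; the digit separators assume C# 7 or later.

```csharp
using System;

static class FloatLiteralDemo {
    // Boxes the argument and reports the name of its .NET type.
    public static string TypeOf(object o) => o.GetType().Name;

    static void Main() {
        Console.WriteLine(TypeOf(123.456));         // Double
        Console.WriteLine(TypeOf(1.23456e2));       // Double
        Console.WriteLine(TypeOf(123.456F));        // Single
        Console.WriteLine(TypeOf(12.3456M));        // Decimal
        Console.WriteLine(TypeOf(1_234.567_890));   // Double

        double  d = 0.1 + 0.2;
        decimal m = 0.1M + 0.2M;
        Console.WriteLine(d == 0.3);                // False (binary rounding error)
        Console.WriteLine(m == 0.3M);               // True  (exact decimal arithmetic)
    }
}
```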
Introspection / Reflection
If you were to inspect the methods of object
, you would notice there is an instance method called GetType
, which returns a value with the name: Type
. If you were to look at the properties and methods of Type
, you may be surprised at the number of them.
Rummaging around, you may come across the related TypeInfo
class, and notice that there is a relationship between many classes leading to the Reflection
namespace. The term ‘reflection’ occurs in programming languages where it is possible to internally inspect any object (remember GetType
is available on all types of objects).
This topic is generally quite advanced, so we present it here more as a concept — all roads lead to Type
. You can easily experiment with Type
. For example, it has a Name
property. So, you can GetType
any object, save the return value in a variable of type Type
, and print out the type's Name
!
The is
operator consults the same run-time type information to determine whether it should return true
or false
.
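The experiment suggested above might look like this minimal sketch. `ReflectionDemo` and `NameOf` are made-up names for illustration.

```csharp
using System;

static class ReflectionDemo {
    // Works for any argument, since every type inherits `GetType` from `object`.
    public static string NameOf(object o) => o.GetType().Name;

    static void Main() {
        object o = "ABC";
        Type t = o.GetType();                     // the run-time type, not `object`
        Console.WriteLine(t.Name);                // String
        Console.WriteLine(t.FullName);            // System.String
        Console.WriteLine(NameOf(12.5));          // Double
        Console.WriteLine(NameOf(new int[3]));    // Int32[]

        Console.WriteLine(o is string);           // True
        Console.WriteLine(o is int);              // False
        Console.WriteLine(t == typeof(string));   // True
    }
}
```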
Summary
This is not a tutorial, but more of a reference with a few examples of where types are used. Although a new C# programmer might not initially be able to deal with all of these topics, they will eventually become required knowledge, as the sophistication of your code and your experience increase.
The key lesson to remember, especially in the beginning, is what we reiterated several times: every value has a type. Types govern space, characteristics and behaviour. Be aware of types at all times, and know what types you are working with (in the results of expressions, literals, etc.).
2021-11-29: Fix some Wikipedia links. [brx]
2021-03-31: Update API links to .NET 5.0; and to Wikipedia mobile. [brx]
2020-03-17: Syntax elements; new tables; type inference. [brx]
2019-10-15: Fixed typos in some comments. [brx]
2019-04-24: Fixed string literal error. [brx]
2018-06-14: Fixes & clarifications. [brx]
2018-06-12: Small additions (mainly string interpolation). [brx]
2017-12-01: Editing. [jjc]
2017-11-26: Additional topics. Editing. [brx;jjc]
2017-11-25: Created. [brx]