1. What is .NET?
.NET is a general-purpose software development platform, similar to Java. At
its core is a virtual machine that turns intermediate language (IL) into
machine code. High-level language compilers for C#, VB.NET and C++ are
provided to turn source code into IL. C# is a new programming language,
very similar to Java. An extensive class library is included, featuring all the
functionality one might expect from a contemporary development platform -
Windows GUI development (Windows Forms), database access (ADO.NET),
web development (ASP.NET), web services, XML etc.
2. What operating systems does the .NET Framework run on?
The runtime supports Windows Server 2003, Windows XP, Windows 2000,
NT4 SP6a and Windows ME/98. Windows 95 is not supported. Some parts of
the framework do not work on all platforms - for example, ASP.NET is only
supported on XP and Windows 2000/2003. Windows 98/ME cannot be used
for development.
IIS is not supported on Windows XP Home Edition, and so cannot be used to
host ASP.NET. However, the ASP.NET Web Matrix web server does run on XP
Home.
The .NET Compact Framework is a version of the .NET Framework for mobile
devices, running Windows CE or Windows Mobile.
The Mono project has a version of the .NET Framework that runs on
Linux.
Terminology
3. What is the CLI? Is it the same as the CLR?
The CLI (Common Language Infrastructure) is the definition of the fundamentals of
the .NET framework - the Common Type System (CTS), metadata, the Virtual
Execution Environment (VES) and its use of intermediate language (IL), and the
support of multiple programming languages via the Common Language Specification
(CLS). The CLI is documented through ECMA - see
http://msdn.microsoft.com/net/ecma/ for more details.
The CLR (Common Language Runtime) is Microsoft's primary implementation of the
CLI. Microsoft also have a shared source implementation known as ROTOR, for
educational purposes, as well as the .NET Compact Framework for mobile devices.
Non-Microsoft CLI implementations include Mono and DotGNU Portable.NET.
4 What is the CTS, and how does it relate to the CLS?
CTS = Common Type System. This is the full range of types that the .NET
runtime understands. Not all .NET languages support all the types in the
CTS.
CLS = Common Language Specification. This is a subset of the CTS which all
.NET languages are expected to support. The idea is that any program which
uses CLS-compliant types can interoperate with any .NET program written in
any language. This interop is very fine-grained - for example a VB.NET class
can inherit from a C# class.
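As a rough illustration of the CTS/CLS distinction, C# lets an assembly ask the compiler to check its public surface against the CLS by applying the CLSCompliant attribute. The sketch below uses hypothetical names (Counter, Total, RawTotal). The uint type is part of the CTS but not the CLS, so exposing it publicly should produce a compiler warning while still compiling:

```csharp
using System;

// Ask the C# compiler to check this assembly's public surface against the CLS.
[assembly: CLSCompliant(true)]

// Hypothetical class, used only for illustration.
public class Counter
{
    private long total = 42;

    // long (System.Int64) is in the CLS, so any .NET language can consume this.
    public long Total { get { return total; } }

    // uint (System.UInt32) is in the CTS but not the CLS: expect a compiler
    // warning here, because the assembly claims CLS compliance.
    public uint RawTotal { get { return (uint)total; } }
}

public class App
{
    public static void Main()
    {
        Counter c = new Counter();
        Console.WriteLine( c.Total );
        Console.WriteLine( c.RawTotal );
    }
}
```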
5 What is IL?
IL = Intermediate Language. Also known as MSIL (Microsoft Intermediate
Language) or CIL (Common Intermediate Language). All .NET source code
(of any language) is compiled to IL during development. The IL is then
converted to machine code at the point where the software is installed, or
(more commonly) at run-time by a Just-In-Time (JIT) compiler.
6 What is C#?
C# is a new language designed by Microsoft to work with the .NET
framework. In their "Introduction to C#" whitepaper, Microsoft describe C#
as follows:
"C# is a simple, modern, object oriented, and type-safe programming
language derived from C and C++. C# (pronounced “C sharp”) is firmly
planted in the C and C++ family tree of languages, and will immediately be
familiar to C and C++ programmers. C# aims to combine the high
productivity of Visual Basic and the raw power of C++."
Substitute 'Java' for 'C#' in the quote above, and you'll see that the
statement still works pretty well :-).
7 What does 'managed' mean in the .NET context?
The term 'managed' is the cause of much confusion. It is used in various
places within .NET, meaning slightly different things.
Managed code: The .NET framework provides several core run-time services
to the programs that run within it - for example exception handling and
security. For these services to work, the code must provide a minimum level
of information to the runtime. Such code is called managed code.
Managed data: This is data that is allocated and freed by the .NET runtime's
garbage collector.
Satish Marwat Dot Net Web Resources satishcm@gmail.com
Managed classes: This is usually referred to in the context of Managed
Extensions (ME) for C++. When using ME C++, a class can be marked with
the __gc keyword. As the name suggests, this means that the memory for
instances of the class is managed by the garbage collector, but it also means
more than that. The class becomes a fully paid-up member of the .NET
community with the benefits and restrictions that brings. An example of a
benefit is proper interop with classes written in other languages - for
example, a managed C++ class can inherit from a VB class. An example of a
restriction is that a managed class can only inherit from one base class.
8 What is reflection?
All .NET compilers produce metadata about the types defined in the modules
they produce. This metadata is packaged along with the module (modules in
turn are packaged together in assemblies), and can be accessed by a
mechanism called reflection. The System.Reflection namespace contains
classes that can be used to interrogate the types for a module/assembly.
Using reflection to access .NET metadata is very similar to using
ITypeLib/ITypeInfo to access type library data in COM, and it is used for
similar purposes - e.g. determining data type sizes for marshaling data
across context/process/machine boundaries.
Reflection can also be used to dynamically invoke methods (see
System.Type.InvokeMember), or even create types dynamically at run-time
(see System.Reflection.Emit.TypeBuilder).
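The following is a minimal sketch of both uses, not a definitive recipe. It lists the types and public methods declared in the running assembly, then calls a method by name through System.Type.InvokeMember. The Greeter class and its Greet method are hypothetical, and the lookup by the bare name "Greeter" assumes the class is not placed in a namespace:

```csharp
using System;
using System.Reflection;

// Hypothetical type to be discovered and invoked through reflection.
public class Greeter
{
    public string Greet( string name ) { return "Hello, " + name; }
}

public class App
{
    public static void Main()
    {
        // Interrogate the metadata of the assembly this code lives in.
        Assembly asm = Assembly.GetExecutingAssembly();
        BindingFlags declared = BindingFlags.Public | BindingFlags.Instance |
                                BindingFlags.Static | BindingFlags.DeclaredOnly;
        foreach( Type t in asm.GetTypes() )
        {
            Console.WriteLine( "Type: " + t.FullName );
            foreach( MethodInfo m in t.GetMethods( declared ) )
                Console.WriteLine( "  Method: " + m.Name );
        }

        // Late-bound call: look the method up by name and invoke it.
        Type greeterType = asm.GetType( "Greeter" );
        object greeter = Activator.CreateInstance( greeterType );
        object result = greeterType.InvokeMember( "Greet",
            BindingFlags.Public | BindingFlags.Instance | BindingFlags.InvokeMethod,
            null, greeter, new object[] { "reflection" } );
        Console.WriteLine( result );   // prints "Hello, reflection"
    }
}
```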
Assemblies
9 What is an assembly?
An assembly is sometimes described as a logical .EXE or .DLL, and can be an
application (with a main entry point) or a library. An assembly consists of
one or more files (dlls, exes, html files etc), and represents a group of
resources, type definitions, and implementations of those types. An assembly
may also contain references to other assemblies. These resources, types and
references are described in a block of data called a manifest. The manifest is
part of the assembly, thus making the assembly self-describing.
An important aspect of assemblies is that they are part of the identity of a
type. The identity of a type is the assembly that houses it combined with the
type name. This means, for example, that if assembly A exports a type called
T, and assembly B exports a type called T, the .NET runtime sees these as
two completely different types. Furthermore, don't get confused between
assemblies and namespaces - namespaces are merely a hierarchical way of
organising type names. To the runtime, type names are type names,
regardless of whether namespaces are used to organise the names. It's the
assembly plus the typename (regardless of whether the type name belongs
to a namespace) that uniquely identifies a type to the runtime.
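One way to see this identity from code is to compare a type's FullName with its AssemblyQualifiedName. This is a small sketch using a hypothetical type called T. The exact assembly names printed depend on how and where the program is built:

```csharp
using System;

// Hypothetical type; another assembly could define its own, unrelated T.
public class T { }

public class App
{
    public static void Main()
    {
        // FullName is the (namespace-qualified) type name only.
        Console.WriteLine( typeof( T ).FullName );

        // AssemblyQualifiedName appends the display name of the housing assembly,
        // which is what distinguishes this T from a T in any other assembly.
        Console.WriteLine( typeof( T ).AssemblyQualifiedName );
        Console.WriteLine( typeof( string ).AssemblyQualifiedName );
    }
}
```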
Assemblies are also important in .NET with respect to security - many of the
security restrictions are enforced at the assembly boundary.
Finally, assemblies are the unit of versioning in .NET - more on this below.
10 How can I produce an assembly?
The simplest way to produce an assembly is directly from a .NET compiler.
For example, the following C# program:
public class CTest
{
    public CTest() { System.Console.WriteLine( "Hello from CTest" ); }
}
can be compiled into a library assembly (dll) like this:
csc /t:library ctest.cs
You can then view the contents of the assembly by running the "IL
Disassembler" tool that comes with the .NET SDK.
Alternatively you can compile your source into modules, and then combine
the modules into an assembly using the assembly linker (al.exe). For the C#
compiler, the /target:module switch is used to generate a module instead of
an assembly.
11 What is the difference between a private assembly and a
shared assembly?
· Location and visibility: A private assembly is normally used by a
single application, and is stored in the application's directory, or a subdirectory
beneath. A shared assembly is normally stored in the global
assembly cache, which is a repository of assemblies maintained by the
.NET runtime. Shared assemblies are usually libraries of code which
many applications will find useful, e.g. the .NET framework classes.
· Versioning: The runtime enforces versioning constraints only on
shared assemblies, not on private assemblies.
12 How do assemblies find each other?
By searching directory paths. There are several factors which can affect the
path (such as the AppDomain host, and application configuration files), but
for private assemblies the search path is normally the application's directory
and its sub-directories. For shared assemblies, the search path is normally
the same as the private assembly path plus the shared assembly cache.
13 How does assembly versioning work?
Each assembly has a version number called the compatibility version. Also
each reference to an assembly (from another assembly) includes both the
name and version of the referenced assembly.
The version number has four numeric parts (e.g. 5.5.2.33). Assemblies with
either of the first two parts different are normally viewed as incompatible. If
the first two parts are the same, but the third is different, the assemblies are
deemed as 'maybe compatible'. If only the fourth part is different, the
assemblies are deemed compatible. However, this is just the default
guideline - it is the version policy that decides to what extent these rules are
enforced. The version policy can be specified via the application configuration
file.
Remember: versioning is only applied to shared assemblies, not private
assemblies.
Application Domains
What is an application domain?
An AppDomain can be thought of as a lightweight process. Multiple
AppDomains can exist inside a Win32 process. The primary purpose of the
AppDomain is to isolate applications from each other, and so it is particularly
useful in hosting scenarios such as ASP.NET. An AppDomain can be
destroyed by the host without affecting other AppDomains in the process.
Win32 processes provide isolation by having distinct memory address spaces.
This is effective, but expensive. The .NET runtime enforces AppDomain
isolation by keeping control over the use of memory - all memory in the
AppDomain is managed by the .NET runtime, so the runtime can ensure that
AppDomains do not access each other's memory.
One non-obvious use of AppDomains is for unloading types. Currently the
only way to unload a .NET type is to destroy the AppDomain it is loaded into.
This is particularly useful if you create and destroy types on-the-fly via
reflection.
How does an AppDomain get created?
AppDomains are usually created by hosts. Examples of hosts are the
Windows Shell, ASP.NET and IE. When you run a .NET application from the
command-line, the host is the Shell. The Shell creates a new AppDomain for
every application.
AppDomains can also be explicitly created by .NET applications. Here is a C#
sample which creates an AppDomain, creates an instance of an object inside
it, and then executes one of the object's methods:
using System;
using System.Runtime.Remoting;
using System.Reflection;

// Derives from MarshalByRefObject so that calls are proxied across the
// AppDomain boundary rather than the object being copied.
public class CAppDomainInfo : MarshalByRefObject
{
    public string GetName() { return AppDomain.CurrentDomain.FriendlyName; }
}

public class App
{
    public static int Main()
    {
        AppDomain ad = AppDomain.CreateDomain( "Andy's new domain" );
        // GetExecutingAssembly names the assembly that contains this code.
        CAppDomainInfo adInfo = (CAppDomainInfo)ad.CreateInstanceAndUnwrap(
            Assembly.GetExecutingAssembly().GetName().Name, "CAppDomainInfo" );
        Console.WriteLine( "Created AppDomain name = " + adInfo.GetName() );
        return 0;
    }
}
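The earlier point about destroying an AppDomain to unload its types can be sketched in the same style. This is an illustrative example only: the domain name and the SayHello method are hypothetical. It assumes the .NET Framework, since later runtimes such as .NET Core do not support creating additional AppDomains:

```csharp
using System;

public class App
{
    public static void Main()
    {
        AppDomain ad = AppDomain.CreateDomain( "Scratch domain" );
        Console.WriteLine( "Created: " + ad.FriendlyName );

        // Run a static method inside the new domain.
        ad.DoCallBack( new CrossAppDomainDelegate( SayHello ) );

        // Destroying the domain unloads everything that was loaded into it.
        AppDomain.Unload( ad );
        Console.WriteLine( "Unloaded" );
    }

    // Hypothetical callback; it reports the domain it is executing in.
    static void SayHello()
    {
        Console.WriteLine( "Running in: " + AppDomain.CurrentDomain.FriendlyName );
    }
}
```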
Can I write my own .NET host?
Yes. For an example of how to do this, take a look at the source for the
dm.net moniker developed by Jason Whittington and Don Box. There is also
a code sample in the .NET SDK called CorHost.
Garbage Collection
What is garbage collection?
Garbage collection is a heap-management strategy where a run-time
component takes responsibility for managing the lifetime of the memory used
by objects. This concept is not new to .NET - Java and many other
languages/runtimes have used garbage collection for some time.
Is it true that objects don't always get destroyed
immediately when the last reference goes away?
Yes. The garbage collector offers no guarantees about the time when an
object will be destroyed and its memory reclaimed.
There was an interesting thread on the DOTNET list, started by Chris Sells,
about the implications of non-deterministic destruction of objects in C#. In
October 2000, Microsoft's Brian Harry posted a lengthy analysis of the
problem, and Chris Sells posted a response to it.
Why doesn't the .NET runtime offer deterministic
destruction?
Because of the garbage collection algorithm. The .NET garbage collector
works by periodically running through a list of all the objects that are
currently being referenced by an application. All the objects that it doesn't
find during this search are ready to be destroyed and the memory reclaimed.
The implication of this algorithm is that the runtime doesn't get notified
immediately when the final reference on an object goes away - it only finds
out during the next 'sweep' of the heap.
Furthermore, this type of algorithm works best by performing the garbage
collection sweep as rarely as possible. Normally heap exhaustion is the
trigger for a collection sweep.
Is the lack of deterministic destruction in .NET a problem?
It's certainly an issue that affects component design. If you have objects that
maintain expensive or scarce resources (e.g. database locks), you need to
provide some way to tell the object to release the resource when it is done.
Microsoft recommend that you provide a method called Dispose() for this
purpose. However, this causes problems for distributed objects - in a
distributed system who calls the Dispose() method? Some form of reference-counting
or ownership-management mechanism is needed to handle
distributed objects - unfortunately the runtime offers no help with this.
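For the simple, non-distributed case, a minimal sketch of the Dispose() approach might look like this. DatabaseLock is a hypothetical class that only prints messages where a real one would acquire and release a resource. The C# using statement calls Dispose() when the block exits:

```csharp
using System;

// Hypothetical resource holder; a real class would wrap an actual lock or handle.
public class DatabaseLock : IDisposable
{
    private bool released;

    public DatabaseLock() { Console.WriteLine( "Lock acquired" ); }

    public void Dispose()
    {
        if( released ) return;   // safe to call more than once
        released = true;
        Console.WriteLine( "Lock released" );
    }
}

public class App
{
    public static void Main()
    {
        using( DatabaseLock dbLock = new DatabaseLock() )
        {
            Console.WriteLine( "Working while holding the lock" );
        }   // Dispose() runs here, even if the block throws
        Console.WriteLine( "Done" );
    }
}
```

The release happens at a known point in the program rather than whenever the garbage collector next runs.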
Should I implement Finalize on my class? Should I
implement IDisposable?
This issue is a little more complex than it first appears. There are really two
categories of class that require deterministic destruction - the first category
manipulate unmanaged types directly, whereas the second category
manipulate managed types that require deterministic destruction. An
example of the first category is a class with an IntPtr member representing
an OS file handle. An example of the second category is a class with a
System.IO.FileStream member.
For the first category, it makes sense to implement IDisposable and override
Finalize. This allows the object user to 'do the right thing' by calling Dispose,
but also provides a fallback of freeing the unmanaged resource in the
Finalizer, should the calling code fail in its duty. However this logic does not
apply to the second category of class, with only managed resources. In this
case implementing Finalize is pointless, as managed member objects cannot
be accessed in the Finalizer. This is because there is no guarantee about the
ordering of Finalizer execution. So only the Dispose method should be
implemented. (If you think about it, it doesn't really make sense to call
Dispose on member objects from a Finalizer anyway, as the member object's
Finalizer will do the required cleanup.)
For classes that need to implement IDisposable and override Finalize, see
Microsoft's documented pattern.
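As a rough guide to its shape, here is a sketch of a first-category class that implements IDisposable and a finalizer. NativeBuffer is a hypothetical class that owns a block of unmanaged memory. Treat this as an approximation and check Microsoft's documentation for the authoritative version:

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical first-category class: it owns an unmanaged resource directly.
public class NativeBuffer : IDisposable
{
    private IntPtr buffer;
    private bool disposed;

    public NativeBuffer( int size )
    {
        buffer = Marshal.AllocHGlobal( size );
    }

    // Called by well-behaved users of the class.
    public void Dispose()
    {
        Dispose( true );
        GC.SuppressFinalize( this );   // cleanup is done, so skip the finalizer
    }

    // Fallback in case the caller never calls Dispose.
    ~NativeBuffer()
    {
        Dispose( false );
    }

    protected virtual void Dispose( bool disposing )
    {
        if( disposed ) return;
        if( disposing )
        {
            // Only on this path is it safe to touch other managed objects.
        }
        Marshal.FreeHGlobal( buffer );   // the unmanaged resource is freed on both paths
        buffer = IntPtr.Zero;
        disposed = true;
    }
}

public class App
{
    public static void Main()
    {
        using( NativeBuffer nb = new NativeBuffer( 256 ) )
        {
            Console.WriteLine( "Buffer allocated" );
        }
        Console.WriteLine( "Buffer freed by Dispose" );
    }
}
```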
Note that some developers argue that implementing a Finalizer is always a
bad idea, as it hides a bug in your code (i.e. the lack of a Dispose call). A
less radical approach is to implement Finalize but include a Debug.Assert at
the start, thus signalling the problem in developer builds but allowing the
cleanup to occur in release builds.
Do I have any control over the garbage collection
algorithm?
A little. For example the System.GC class exposes a Collect method, which
forces the garbage collector to collect all unreferenced objects immediately.
Also there is a gcConcurrent setting that can be specified via the application
configuration file. This specifies whether or not the garbage collector
performs some of its collection activities on a separate thread. The setting
only applies on multi-processor machines, and defaults to true.
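A small sketch of forcing a collection from code is shown below. The byte counts it prints vary from run to run and machine to machine, so they are only indicative:

```csharp
using System;

public class App
{
    public static void Main()
    {
        // Allocate some short-lived garbage.
        for( int i = 0; i < 10000; i++ )
        {
            byte[] junk = new byte[1024];
            junk[0] = 1;
        }

        Console.WriteLine( "Before collect: " + GC.GetTotalMemory( false ) + " bytes" );

        // Force a collection, then let any pending finalizers run.
        GC.Collect();
        GC.WaitForPendingFinalizers();

        Console.WriteLine( "After collect:  " + GC.GetTotalMemory( false ) + " bytes" );
        Console.WriteLine( "Highest generation: " + GC.MaxGeneration );
    }
}
```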
How can I find out what the garbage collector is doing?
Lots of interesting statistics are exported from the .NET runtime via the '.NET
CLR xxx' performance counters. Use Performance Monitor to view them.
What is the lapsed listener problem?
The lapsed listener problem is one of the primary causes of leaks in .NET
applications. It occurs when a subscriber (or 'listener') signs up for a
publisher's event, but fails to unsubscribe. The failure to unsubscribe means
that the publisher maintains a reference to the subscriber as long as the
publisher is alive. For some publishers, this may be the duration of the
application.
This situation causes two problems. The obvious problem is the leakage of
the subscriber object. The other problem is the performance degradation due
to the publisher sending redundant notifications to 'zombie' subscribers.
There are at least a couple of solutions to the problem. The simplest is to
make sure the subscriber is unsubscribed from the publisher, typically by
adding an Unsubscribe() method to the subscriber. Another solution,
documented by Shawn Van Ness, is to change the publisher to use weak
references in its subscriber list.
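The first solution can be sketched as follows. Publisher, Subscriber and the Tick event are hypothetical names chosen for illustration. The 'lapsed' subscriber never unsubscribes, so the publisher's delegate keeps it alive and it goes on receiving notifications even after the application has dropped its own reference:

```csharp
using System;

public class Publisher
{
    public event EventHandler Tick;

    public void Raise()
    {
        EventHandler handler = Tick;
        if( handler != null ) handler( this, EventArgs.Empty );
    }
}

public class Subscriber
{
    private Publisher publisher;
    private string name;

    public Subscriber( string name, Publisher publisher )
    {
        this.name = name;
        this.publisher = publisher;
        publisher.Tick += new EventHandler( OnTick );   // publisher now references this object
    }

    public void Unsubscribe()
    {
        publisher.Tick -= new EventHandler( OnTick );   // drop the publisher's reference
    }

    private void OnTick( object sender, EventArgs e )
    {
        Console.WriteLine( name + " notified" );
    }
}

public class App
{
    public static void Main()
    {
        Publisher publisher = new Publisher();
        Subscriber lapsed = new Subscriber( "lapsed", publisher );
        Subscriber tidy = new Subscriber( "tidy", publisher );

        tidy.Unsubscribe();   // tidy can now be collected
        lapsed = null;        // still reachable through the publisher's event
        tidy = null;

        GC.Collect();
        publisher.Raise();    // prints "lapsed notified" only
    }
}
```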
When do I need to use GC.KeepAlive?
It's very unintuitive, but the runtime can decide that an object is garbage
much sooner than you expect. More specifically, an object can become
garbage while a method is executing on the object, which is contrary to most
developers' expectations. Chris Brumme explains the issue on his blog. I've
taken Chris's code and expanded it into a full app that you can play with if
you want to prove to yourself that this is a real problem:
using System;
using System.Runtime.InteropServices;

class Win32
{
    [DllImport("kernel32.dll")]
    public static extern IntPtr CreateEvent( IntPtr lpEventAttributes,
        bool bManualReset, bool bInitialState, string lpName );

    [DllImport("kernel32.dll", SetLastError=true)]
    public static extern bool CloseHandle( IntPtr hObject );

    [DllImport("kernel32.dll")]
    public static extern bool SetEvent( IntPtr hEvent );
}

class EventUser
{
    public EventUser()
    {
        hEvent = Win32.CreateEvent( IntPtr.Zero, false, false, null );
    }

    // Finalizer: closes the OS handle when the object is collected.
    ~EventUser()
    {
        Win32.CloseHandle( hEvent );
        Console.WriteLine("EventUser finalized");
    }

    public void UseEvent()
    {
        // Only a copy of the handle is passed on; 'this' is not used again.
        UseEventInStatic( this.hEvent );
    }

    static void UseEventInStatic( IntPtr hEvent )
    {
        //GC.Collect();
        bool bSuccess = Win32.SetEvent( hEvent );
        Console.WriteLine( "SetEvent " + (bSuccess ? "succeeded" : "FAILED!") );
    }

    IntPtr hEvent;
}

class App
{
    static void Main(string[] args)
    {
        EventUser eventUser = new EventUser();
        eventUser.UseEvent();
    }
}
If you run this code, it'll probably work fine, and you'll get the following
output:
SetEvent succeeded
EventUser finalized
However, if you uncomment the GC.Collect() call in the UseEventInStatic()
method, you'll get this output:
EventUser finalized
SetEvent FAILED!
(Note that you need to use a release build to reproduce this problem.)
So what's happening here? Well, at the point where UseEvent() calls
UseEventInStatic(), a copy is taken of the hEvent field, and there are no
further references to the EventUser object anywhere in the code. So as far as
the runtime is concerned, the EventUser object is garbage and can be
collected. Normally of course the collection won't happen immediately, so
you'll get away with it, but sooner or later a collection will occur at the wrong
time, and your app will fail.
A solution to this problem is to add a call to GC.KeepAlive(this) to the end of
the UseEvent method, as Chris explains.
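The shape of that fix can be sketched with a self-contained variant of the example above. BufferUser and its methods are hypothetical stand-ins for EventUser, with a block of unmanaged memory taking the place of the Win32 event handle so that the sketch does not depend on kernel32.dll:

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical stand-in for EventUser: owns unmanaged memory instead of an event handle.
class BufferUser
{
    IntPtr hBuffer;

    public BufferUser() { hBuffer = Marshal.AllocHGlobal( 4 ); }

    ~BufferUser() { Marshal.FreeHGlobal( hBuffer ); }

    public void UseBuffer()
    {
        UseBufferInStatic( this.hBuffer );
        // Keeps 'this' reachable until the static call has returned, so the
        // finalizer cannot free the buffer while it is still in use.
        GC.KeepAlive( this );
    }

    static void UseBufferInStatic( IntPtr hBuffer )
    {
        // Deliberately provoke a collection at the worst possible moment.
        GC.Collect();
        GC.WaitForPendingFinalizers();
        Marshal.WriteInt32( hBuffer, 42 );
        Console.WriteLine( "Read back: " + Marshal.ReadInt32( hBuffer ) );
    }
}

class App
{
    static void Main()
    {
        BufferUser user = new BufferUser();
        user.UseBuffer();
    }
}
```

Without the GC.KeepAlive call, a release build could run the finalizer during the forced collection, leaving the static method writing to memory that has already been freed.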