This is why it's essential to always be aware of which types are value types in the .NET type system. Attempts by either users or the designers to conflate or unify them are ill-conceived. The nullability semantics could have been kept consistent if they had been designed that way from the very beginning of the CLR, but that would not have been consistent with Java's JVM and type system (which the CLR was trying to mimic).
That said, we already have value types like System.Int32 which inherit from System.ValueType (an abstract type) which inherits from System.Object (a non-abstract reference type), so things are already a bit weird.
pjmlp 8 hours ago [-]
The class/struct approach predates C#.
For example, in Eiffel they are references by default, but can also be turned into value types if they are blessed ones like the numeric types, or if the developer tags them as expanded classes, either at definition or at declaration site.
Delphi makes the distinction between classical objects and records from Object Pascal (both value-based, with explicit pointers required for heap allocation) and class types, which are heap-only.
Modula-3 classes and records also follow a similar approach: OBJECT follows the same semantic model as REF RECORD.
For more modern examples, also D and Swift follow this approach.
And plenty of other examples for anyone wanting to dive into SIGPLAN.
tialaramex 8 hours ago [-]
It is in hindsight unfortunate that this idea didn't get into the CLR itself.
The blog post doesn't mention this, but if I say my function only takes this 32-bit signed integer value type, a VB.NET caller can't hand it "null", because that's not a 32-bit signed integer. If instead I say it takes string (not the nullable string?), too bad: the VB.NET caller can pass null anyway, because in the CLR there's no distinction.
You're actually expected (if you provide a public surface) to write a null test in C# code whose function signature explicitly says this mustn't be null; otherwise it might blow up at runtime, because the CLR doesn't care.
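If it helps, here's a minimal sketch of that pattern (type and method names are mine, not from the post): the non-nullable signature only binds the C# compiler, so the body still guards at runtime.

```csharp
#nullable enable
using System;

public static class Greeter
{
    // The non-nullable `string` is a compile-time annotation only.
    // The CLR will still pass null here from VB.NET, reflection,
    // or any caller compiled without nullable reference types.
    public static int CountChars(string text)
    {
        ArgumentNullException.ThrowIfNull(text); // the explicit runtime null test
        return text.Length;
    }
}
```

(`ArgumentNullException.ThrowIfNull` is available from .NET 6; on older targets the equivalent is `if (text == null) throw new ArgumentNullException(nameof(text));`.)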
pjmlp 8 hours ago [-]
Nullable types came into C# during the .NET Framework days, back when the team had a very high bar for adding features that would change the runtime, since it shipped as an OS component.
Hence why async/await is such a mess of IL bytecode: it was implemented in user space, so to speak.
Only with .NET Core did they follow other languages in not tying the runtime to the OS, a lesson Google also had to learn (ART has been updatable via the Play Store since Android 12).
bob1029 8 hours ago [-]
> That said, we already have value types like System.Int32 which inherit from System.ValueType (an abstract type) which inherits from System.Object (a non-abstract reference type), so things are already a bit weird.
But it all works, right? The runtime can do anything it wants with the IL. The handling of Vector<T> is a good example of this: locating arbitrary types/namespaces and emitting special instructions based on the current machine's capabilities. Normalizing value vs. reference semantics would be a tiny drop in this bucket.
dmatech 8 hours ago [-]
There's some really interesting stuff about how that works here: https://stackoverflow.com/a/56392846/7077511
Truthfully I think the language lost something critically important when we went gung-ho on purging ourselves of the EVILS! of nullable types.
Back in the Paleolithic, the only way to get a nullable value type was to extremely explicitly box it into a Nullable<T>. The distinction between value and reference types was crystal clear and unmistakable. Boxing values required an active and deliberate decision.
Now I guess we just box everything because null checks are hard or something.
chrisoverzero 8 hours ago [-]
> Luckily, since type constraints are part of the signature of the method and there is no ambiguity, I am allowed to make this overload.
This isn’t the case. It’s allowed because the question-mark syntax means two different things in value- and reference-type contexts. The signatures really look like this:
    public static IEnumerable<TR> SelectNotNull<T, TR>(
        this IEnumerable<T> source,
        Func<T, TR> fn)
        where TR : class // …and the nullability of TR is tracked by the compiler

    public static IEnumerable<TR> SelectNotNull<T, TR>(
        this IEnumerable<T> source,
        Func<T, Nullable<TR>> fn)
        where TR : struct
This is an allowable overload.
jibal 9 hours ago [-]
> to denote two completely separate concepts
No, same concept ... you're making a mistake that some call "implementation on the brain". That they're the same concept is why you're able to specify a common operation, SelectNotNull. That you had to provide an explicit type constraint that a compiler should be able to infer doesn't change that.
Dwedit 9 hours ago [-]
Please remove the uppercase on the second "Nullable" in the headline.
manuc66 12 hours ago [-]
@Bogdanp maybe this is another way to express it:
public static class EnumerableExtensions
{
    public static IEnumerable<TR> SelectNotNull<T, TR>(
        this IEnumerable<T> source,
        Func<T, TR?> fn)
        where TR : class
    {
        return source.Select(fn)
            .Where(it => it != null)
            .OfType<TR>();
    }

    public static IEnumerable<TR> SelectNotNull<T, TR>(
        this IEnumerable<T> source,
        Func<T, TR?> fn)
        where TR : struct
    {
        return source.Select(fn)
            .Where(it => it != null)
            .Select(item => item.Value);
    }
}
Uvix 9 hours ago [-]
Both look like they’ll work. For the struct case I think yours is better. For the class case I think using Cast() like the original instead of OfType() makes more sense - you don’t need to filter out other types from the enumerable, so cast once instead of casting then performing a useless check on the result.
nick_ 2 hours ago [-]
I'm pretty sure you only need the OfType call. It filters out nulls already.
mrcsharp 8 hours ago [-]
As a side note: please avoid, as much as possible, putting `.Select(..)` before `.Where(..)`. You are wasting CPU cycles and memory space by forcing LINQ to map all the items and then filtering on the mapped value.
In most situations, you should be able to filter on the source enumerable before mapping making the whole thing more efficient.
Additionally, that `.Cast<TR>(..)` at the end should have been a dead giveaway that you are going down the wrong path here. You are incurring even more CPU and memory costs, as the `.Cast<TR>(..)` call will now iterate through all the items needlessly.[1]
Also, the design of this method doesn't seem to make much difference to me anyway:
```
var strs = source.SelectNotNull(it => it);
```
vs
```
var strs = source.Where(it => it != null);
```
A lot of other LINQ extension methods allow you to pass in a predicate expression that will be executed on the source enumerable:
```
var str = source.First(it => it != null);
```
[1] https://source.dot.net/#System.Linq/System/Linq/Cast.cs,152b...
> Also, the design of this method doesn't seem to make much difference to me anyway:
```
var strs = source.SelectNotNull(it => it);
```
vs
```
var strs = source.Where(it => it != null);
```
Wouldn't the first be IEnumerable<TR> and the second be IEnumerable<TR?>?
I imagine that's the main driver for creating SelectNotNull: so that you get the non-nullable type out of the LINQ query.
mrcsharp 7 hours ago [-]
> I imagine that's the main driver for creating SelectNotNull
Sure. And now we are fighting the compiler and in the process writing less efficient code.
The compiler gives us a way to deal with this situation. It is all about being absolutely clear with intentions. Yes, Where(..) in my example would return IEnumerable<TR?> but then in subsequent code I can tell the compiler that I know for a fact that TR? is actually TR by using the null forgiving operator (!).
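A minimal sketch of that approach (variable names are illustrative): filter first, then use the null-forgiving operator to assert non-nullness to the compiler.

```csharp
#nullable enable
using System;
using System.Linq;

string?[] source = { "a", null, "b" };

// Filter first, then tell the compiler the survivors are non-null
// with the null-forgiving operator (!).
string[] strs = source
    .Where(it => it != null)
    .Select(it => it!) // safe: nulls were filtered out above
    .ToArray();

Console.WriteLine(string.Join(",", strs)); // prints "a,b"
```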
angrysaki 7 hours ago [-]
>The compiler gives us a way to deal with this situation. It is all about being absolutely clear with intentions. Yes, Where(..) in my example would return IEnumerable<TR?> but then in subsequent code I can tell the compiler that I know for a fact that TR? is actually TR by using the null forgiving operator (!).
I guess that seems way less clear about intentions to me. If I have an array of potentially null values and I want to filter out the nulls, I'd much rather have an operation that returns a T[] vs a T?[].
I should note that I also have an "IEnumerable<T> WhereNotNull(IEnumerable<T>?)" function in my codebase, but I implemented it using foreach/yield, which doesn't suffer from the extra Cast<>()
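One way such a foreach/yield helper might look (the class name and the null-handling of the source are my guesses, not the poster's actual code):

```csharp
#nullable enable
using System.Collections.Generic;

public static class NullFiltering
{
    // Single pass, no Cast<T>() needed, and the element type of the
    // result is the non-nullable T rather than T?.
    public static IEnumerable<T> WhereNotNull<T>(this IEnumerable<T?>? source)
        where T : class
    {
        if (source == null) yield break;
        foreach (var item in source)
        {
            if (item != null)
                yield return item;
        }
    }
}
```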
LtWorf 8 hours ago [-]
If I was able to write a simple optimiser for relational algebra, I'm sure Microsoft engineers can come up with something :D
bazoom42 8 hours ago [-]
Nullability on reference types is bolted on in an ugly way in C#. Still a very valuable feature, though.
orthoxerox 8 hours ago [-]
OP's code only works because the method goes from T? to T. If it accepted T and returned T?, he would've gotten an error like this:
CS0111: Type 'Utils' already defines a member called 'Foo' with the same parameter types
That's because the "type constraints are part of the signature of the method and there is no ambiguity" statement is wrong. They are not.
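A quick way to see both directions (names here are illustrative): going from T? to T gives the two overloads different parameter types after lowering, while going from T to T? leaves them identical, which is what triggers CS0111.

```csharp
#nullable enable
public static class Utils
{
    // Compiles: the parameters lower to T (class) vs Nullable<T> (struct).
    public static T Foo<T>(T? value) where T : class => value!;
    public static T Foo<T>(T? value) where T : struct => value.Value;

    // Does not compile: both parameters lower to plain T, and constraints
    // alone can't distinguish the signatures (CS0111).
    // public static T? Bar<T>(T value) where T : class => null;
    // public static T? Bar<T>(T value) where T : struct => null;
}
```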
gwbas1c 7 hours ago [-]
Uhm: There is the OfType method that is generally used to filter an IEnumerable<T?> to IEnumerable<T>. It doesn't care about reference vs value types, because it doesn't compare to null. Instead it checks the type of each element.
Why does that work? If you have a value, foo, declared as "int?", (foo is int) evaluates to true if a value is present, and false if no value is present. The same thing happens if foo is declared as string?.
BTW: I checked if the overload can be avoided by using the "default" keyword. It can't.
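For what it's worth, a small sketch of that behavior, since OfType relies on the "is" check rather than a null comparison:

```csharp
#nullable enable
using System;
using System.Linq;

// OfType<T>() keeps only elements that pass an "is T" check, so it drops
// both null references and empty Nullable<T> values in one pass.
int?[] numbers = { 1, null, 3 };
int[] present = numbers.OfType<int>().ToArray();

string?[] words = { "a", null };
string[] nonNull = words.OfType<string>().ToArray();

Console.WriteLine(string.Join(",", present));  // prints "1,3"
Console.WriteLine(string.Join(",", nonNull));  // prints "a"
```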