Smart spreadsheets, and use of spreadsheets as a tool for regular programming

I love spreadsheets. It’s possibly the best programming tool ever created. But I would like them to be even smarter. In Excel you will normally have to define an expression as just one single line, for example

=A1 * IF(A1>10;2;3)

which is not very readable. Why not make it possible to apply normal code, like for example Kotlin, in a spreadsheet? Instead of a one-liner there could be a real multi-line expression, like

val factor = a1 > 10 ? 2 : 3
a1 * factor

Define functions
Further, it should be possible to define your own functions, like we do in regular programming, so that we are not limited by the pre-defined functions defined in Excel (or other spreadsheet software).

Static typing and OOP
Excel is all procedural, with all functions in one global namespace. You don’t have the possibility to go on an object and call methods. By using Kotlin as the language for the spreadsheet, it would be possible to, for example, go on an int and get suggestions to the possible methods to call.
It would also be possible to define your own classes and use them in the spreadsheet.

Use spreadsheets as part of regular programming
In many cases spreadsheets would be easier to navigate than regular “pure” code. Of course we should not solve all our programming problems with a spreadsheet, but I think it would be very useful to use spreadsheets as part of regular programming.

Example:

val sheet = StockAnalysisSpreadsheet(stocks)
val scorePerStock : Map<Stock, Double> = sheet.scorePerStock

The StockAnalysisSpreadsheet class that derives from AutoGeneratedStockAnalyisSpreadsheet which is generated by the spreadsheet software:

class StockAnalysisSpreadsheet(stocks: List<Stock>) extends AutoGeneratedStockAnalyisSpreadsheet(stocks) {

  val scorePerStock : Map<Stock, Double>
    get() = b15 // All cells in the spreadsheet are available here as properties
}

Here we send stocks as an input to the sheet, and we get out a score for each stock. We can add more getters if we want access to more cells.

(There may be lots of improvements here. Extension functions may be a better approach than an auto generated class.)

This might seem a bit confusing, but try to imagine the opportunities to feed a spreadsheet with values from the code. Here the stocks may go to b1 in the spread sheet, and the values get extracted to proceding columns. Then calculations are applied on these values, before we finally get the Scores in b15. A difference for normal spread sheets is that we can have a list within a specific cell, b1 is a List<Stock>. Then we apply operations to each single stock in b2, b3 etc. Let’s b2 contains the Price to earnings-ratios:

b1.map(Stock::priceToEarningsRatio)

When a cell contains a list, the cell will get many sub cells, and as long as we keep the lists parallel (b1[5] represents same entity (in this case stock) as b2[5]) we can easily compare the values as in a normal spreadsheet, while keeping it dynamic. While developing, we can use example data as input. In this case we would specify a list of stocks that go to b1, and we will see how the values change why we create the spreadsheet.

Smarter spreadsheet, and mix between spread sheets and regular code would be useful especially in data science/quant analysis, and function as a bridge between programmers and quant analysts.

I guess there will be some questions here. If you are interested, I can create some more examples on this idea (and it’s just an idea yet, nothing is developed so far).

What do you think? Does something like this already exist?

I may not be following the same threads of thought as you, here - just throwing around some ideas.

First, having done a little bit of Office programming, I’ll mention a couple of things. VBA is a painful programming environment not only because it’s not really OOP but also because the source lives inside the binary spreadsheet, so you can’t easily do diffs etc. when working with version control. You can manipulate Office apps / files from outside via COM, so then you can use whatever language you like, but that’s pretty awkward, too.

More interesting (to me) is the idea that spreadsheets are a handy GUI for a constrained form of Dataflow programming. Reactive programming can be seen as something similar: user edits to non-formula cells as new values in “hot” streams, and cell formulae as the expressions you use to combine streams. Whether the stream values are scalars, lists, or whatever doesn’t matter to a reactive program.

The other thing spreadsheets add over the basic dataflow/reactive approach is having 1D or 2D operations across ranges of cells. I can vaguely imagine some code for having “reactive nodes” in lists, grids, or indeed and structure you like, and composite operations across such structures.

Hope those are some useful thoughts, despite being quite abstract!

1 Like

Smarter spreadsheets has a name: “Python Pandas”. I ignore if there is some related project based on Kotlin right now.

Thanks for your thoughts @HughG, and sorry for late answer. I am on a mini-retirement and needed a break from software. Now I’m back and fit for fight (with the bugs) again. I will give a much simpler example this time.

Let’s first start with the concept of lists, which is currently missing in spreadsheets (at least as I am aware of).

In traditional spreadsheets, each entry in a list has to be on a separate row with a separate name, a2, a3, a4 are all rows with names:

a b c
1 Name Age
2 Chris 35
3 Alex 23
4 Zara 45
5

When you want to use the list somewhere, you will need to write a2:a4 in order to get a list of names. But what if the list suddenly consist of 8 elements? Then the hard-coded list a2:a4 won’t work anymore.

I want to use lists within a single cell. Let’s merge the names and the ages into one single row in the spreadsheet. The names are in in a2, with type List<String> and ages in b2 with type List<Int>:

a b c
1 Name Age
2 [Chris, Alex, Zara] [35, 23, 45]
3

But this looks awful. It should be possible in the spreadsheet software to expand the lists so that we see each entry as a sub row within each cell. In same way as we can format a cell with bold it should also be possible to format a list as expanded:

a b c
1 Name Age
2
[0] Chris
[1] Alex
[2] Zara
[0] 35
[1] 23
[2] 45
3

Wow! This looks a lot better! But it seems a bit bad with the repetition of the indexes. These lists are parallel: The age of Chris is 35 since they are both index 0 etc. So why not tell this to the spreadsheet. We mark a2 and b2, right click and select “Make parallel lists” (in my dream spreadsheet software). The cells get then numbering only on the first cell (to the left):

a b c
1 Name Age
2
[0] Chris
[1] Alex
[2] Zara
35
23
45
3

(There could also be a border or something telling that the lists are parallel, but didn’t manage to do that in the editor here)

In cases where the hole row consist of parallel lists, we can make the indexing general for the row. We would mark the whole row, and, as we did before, select “Make parallel lists”:

a b c
1 Name Age
2
[0]
[1]
[2]
Chris
Alex
Zara
35
23
45
3

(It could look at bit better. I had to hack it a bit here because of lack of colspan in the editor).

So how do we get these lists?

There could be a concept of inputs to the spread. Here we will maybe want to have an input called persons of type List<Person>.

data class Person(val name: String, val age: Int)

The cell b1 would be = persons.map(Person::name) and b2 = persons.map(Person::age).

This was a super simple example, so let’s add little bit logic, so that we can get little bit meat to return back to the code monkeys.

a b c
1 Name Age
2
[0]
[1]
[2]
Chris
Alex
Zara
35
23
45
3 Average 34.33

We simply use the average() extension function in List to calculate the average:
b3 = b2.average().

We want to return the averageAge as an output from the spreadsheet.

So now we have
Input
persons : List<Person>

Output
averageAge : Double = b3

From the code, we can now make instances of the spreadsheet like this:

    val persons = listOf<Person>(Person("Chris", 35), Person("Alex", 23), Person("Zara", 45))
    val sheet = PersonsSpreadsheet(persons)
    println("Average age is ${sheet.averageAge}")

PersonsSpreadsheet is generated by the spreadsheet software. All the inputs we set (in this case only one), will be arguments to the constructor. The outputs we set will be properties in the class.

Hope this was more understandable! Please share thoughts and opinions, and let me know if something like this already exists.

I am in the process of developing Table API, which will include spreadsheet implementation. It is not the aim to duplicate excell functionality. It is too specific, but it is quite possible to operate with spreadsheets (no GUI though, it is completely different task).

By the way, Pandas is not about spreadsheets, it is about dense tables.

1 Like