Wednesday, November 23, 2022

Learning PureScript - Discipline is Freedom

This post is based on the book "Functional Programming Made Easier: A Step-by-Step Guide".  

In the 1960s, the Structured Programming paradigm was developed to overcome the Spaghetti Code situation resulting from an overuse of GOTO statements.  Essentially, GOTO via Jump instructions was phased out, but there was a lot of resistance to the new Structured Programming paradigm.  Today, we are trying to replace the Imperative Programming (IP) paradigm with the Functional Programming (FP) paradigm.  There are five reasons for this: global state, mutable state, purity, optimisation, and types.  

The Global State or Global Variables in the IP paradigm allow data to be changed at any time from anywhere in the program, which makes it difficult to remember or reason about every possible use.  All code that shares a Global Variable becomes tightly coupled through it, and the conventions that keep this coupling safe are easily broken by non-compliant code.  Safe concurrent access is not enforced, and there can be variable name collisions.  Object-Oriented Programming (OOP) does not resolve these problems, because we can easily create a Singleton Object that contains all our variables and exposes them publicly, in an emulation of Global Variables.  FP makes it impossible to have Global State.  

In IP, variables are mutable, which makes programs much harder to reason about because a value can change at any point during execution.  This makes programs more fragile.  In FP, there are no mutable variables.  There are only expressions that bind a name to a fixed value, such as a number, boolean, string, or another expression.  This gives referential transparency: a name can be replaced by its value without changing the program's behaviour.  Loops in IP, which require mutable variables, become recursive functions in FP, which parallel the mathematical definitions and are easy to understand.  
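As a minimal sketch of the loop-versus-recursion point, here is the factorial written both ways.  Python is used only for illustration (it is not the language of the book), and the function names are hypothetical.  

```python
def factorial_loop(n):
    """Imperative style: the accumulator is mutated on every iteration."""
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result


def factorial_recursive(n):
    """Functional style: mirrors the definition 0! = 1, n! = n * (n - 1)!."""
    return 1 if n == 0 else n * factorial_recursive(n - 1)


print(factorial_loop(5), factorial_recursive(5))  # prints: 120 120
```

The recursive version never reassigns a name, so each call can be read as a plain equation.  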

In FP, functions are pure: they take one or more inputs, perform a computation, and return a result.  In IP, a function may not return a result at all, because it may be used to perform other tasks, such as printing a value to the screen, which do not require returning the result of a computation.  These other tasks are called Side Effects.  In FP, pure functions have no Side Effects: apart from heating up the CPU, calling one with the same input always produces the same output, without writing to files, displaying to the screen, or updating a database.  Functions with multiple parameters can also be rewritten as a chain of functions that each take only one parameter, which is known as Currying.  
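The difference between a pure function, a function with a Side Effect, and a Curried function can be sketched as follows.  This is an illustration in Python with hypothetical names, not code from the book.  

```python
def add(x, y):
    """Pure: the result depends only on the inputs, and nothing else happens."""
    return x + y


def add_and_print(x, y):
    """Impure: printing to the screen is a Side Effect."""
    total = x + y
    print(total)
    return total


def curried_add(x):
    """Curried form of add: each function takes exactly one parameter."""
    return lambda y: x + y


add_five = curried_add(5)  # supplying the first argument yields a new function
print(add(5, 3), curried_add(5)(3), add_five(3))  # prints: 8 8 8
```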

FP enables memoisation: because a pure function always returns the same output for the same input, the results of expensive calculations can be stored and looked up from the cache instead of being recalculated.  This is one form of optimisation.  The absence of Side Effects helps as well: Side Effects cannot be cached, so when a computation with a Side Effect is performed multiple times in IP, the costly Side Effect occurs every time.  
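Memoisation of a pure function can be sketched with Python's standard `functools.lru_cache`.  The Fibonacci function is only an illustrative choice of an expensive pure computation.  

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def fib(n):
    """Pure function: safe to cache because the output depends only on n."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


print(fib(30))                  # prints: 832040
print(fib.cache_info().misses)  # each distinct n is computed only once
```

Without the cache, the same sub-results would be recomputed many times.  Caching is only safe here because `fib` has no Side Effects.  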

In statically typed FP languages such as PureScript, types are checked at compile time, so type errors can be detected early, refactoring can be done more easily, and IDEs can better support code development.  Hence, there is better code quality and correctness.  Static Types, as compared to Dynamic Types, limit the flexibility in coding and require explicit typing, which results in more cumbersome code expressions.  However, some FP languages support Type Inference, which frees the programmer from explicitly declaring Types because the compiler can infer the Types based on the usage of the expressions.  Some FP languages mix type definitions with variable names, which can be more confusing.  PureScript separates the type definitions from the variable expressions.  

Monday, November 21, 2022

Locating the source of a problem

To locate the source of a problem, we can either begin where the problem occurs and work towards the source of the bug, or start at the top level of the application and drill down until the buggy source is located.  

When a program crashes and the error messages indicate a specific problem routine, we can troubleshoot from the point of failure: starting at the routine where the problem shows up, we follow the calls through the sequence of routines back to their origin.  This will likely lead to the source of the bug.  When a program freezes, this method can also be used by starting from a memory dump.  Memory dumps are possible if there are tools or commands available for this.  Otherwise, we'll have to create a dump of log messages from the entries and exits of our code routines and examine them.  One possible source of the problem is the libraries our code uses.  
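As a rough sketch of working back from a crash to its origin, Python's standard `traceback` module can list the routines between the failure and the outermost caller.  The routines `parse_age`, `load_record`, `main` and the helper `call_chain` are hypothetical examples.  

```python
import traceback


def parse_age(text):
    return int(text)  # the crash surfaces here


def load_record(fields):
    return {"name": fields[0], "age": parse_age(fields[1])}


def main():
    load_record(["Ada", "thirty"])  # the bad data originates here


def call_chain(exc):
    """Return routine names from the crash site back to the outermost caller."""
    frames = traceback.extract_tb(exc.__traceback__)
    return [frame.name for frame in reversed(frames)]


try:
    main()
except ValueError as exc:
    chain = call_chain(exc)
    print(" <- ".join(chain))  # starts with: parse_age <- load_record <- main
```

Reading the chain from left to right moves from the symptom in `parse_age` to the caller in `main` that supplied the bad input.  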

If the problem is an emergent property which cannot be readily associated with any part of the code, then we'll have to begin at the top level of the code, break down the code into parts, and examine the contribution of each part to the problem individually.  These problems relate to the performance, security and reliability of our code.  


Reference:

Diomidis Spinellis, "Effective Debugging: 66 Specific Ways to Debug Software and Systems"

Carefully examine the data at a routine's entry and exit points

When debugging a routine, we can resolve many problems by first examining its entry point or pre-conditions, which include its program state and input parameters, and then its exit point or post-conditions, which include its program state and return values.  Problems at the entry point indicate that the code leading to the routine is faulty, while problems at the exit point, when the entry data is correct, indicate that the routine itself is faulty.  If there are no problems at entry and exit, then the fault does not lie in the routine.  

To debug a sequence of routines, it helps to create log messages for indicating the entries into and exits from the routines and the data at these points.  This is more communicative than the use of breakpoints, although breakpoints are usually easier to use.  It is important to verify and not assume.  
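One possible way to log entries and exits, sketched here with Python's standard `logging` and `functools` modules, is a decorator that records the data at both points.  The decorator name `traced` and the example routine `average` are hypothetical.  

```python
import functools
import logging

logging.basicConfig(level=logging.DEBUG, format="%(message)s")
log = logging.getLogger("trace")


def traced(func):
    """Log the arguments on entry and the result (or exception) on exit."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        log.debug("enter %s args=%r kwargs=%r", func.__name__, args, kwargs)
        try:
            result = func(*args, **kwargs)
        except Exception:
            log.exception("exit %s by exception", func.__name__)
            raise
        log.debug("exit %s result=%r", func.__name__, result)
        return result
    return wrapper


@traced
def average(values):
    return sum(values) / len(values)


average([2, 4, 6])  # logs the entry arguments and the result 4.0
```

Applying the decorator to every routine in a suspect sequence shows where correct entry data first turns into incorrect exit data.  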


Reference:

Diomidis Spinellis, "Effective Debugging: 66 Specific Ways to Debug Software and Systems"

Monday, November 14, 2022

Focus our Web Search Queries to Find Possible Solutions to our Debug Problem

Summary

We should search the Internet for solutions to error messages by enclosing the messages in double quotes, and include relevant details if needed.  Answers from the website Stack Overflow are usually very helpful.  If we can't find answers, we can post our problems on Stack Overflow or open an issue with the relevant developers.  

Currently, the Internet is ubiquitous, so we can use web search engines, such as Google, Yahoo, Bing, etc., to find answers and solutions to our questions and problems.  We can paste our debug error message, enclosed in double quotes, directly into the search engine's search box.  We can also add anything related to the problem to the search phrase, such as the name of the problematic library, middleware, class, or method, or the returned error codes.  
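Building such a query can be sketched in a few lines of Python.  The helper name `build_search_url` is hypothetical, and the Google search URL prefix is an assumption that can be swapped for any other search engine.  

```python
from urllib.parse import quote_plus


def build_search_url(error_message, *details,
                     base="https://www.google.com/search?q="):
    """Quote the exact error message and append related details."""
    query = " ".join(['"{}"'.format(error_message), *details])
    return base + quote_plus(query)


print(build_search_url("segmentation fault", "libfoo"))
# prints: https://www.google.com/search?q=%22segmentation+fault%22+libfoo
```

The double quotes (encoded as `%22`) ask the search engine for the exact phrase, while the unquoted details narrow the results.  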

Solutions to problems associated with invoking APIs can also be found by researching how others use the API.  This can be found in available code already on the Internet, such as open source software, or through searching Black Duck Open Hub Code Search: https://www.openhub.net/.  We find the problematic function in the code example, and investigate how the input parameters are obtained and how the output return value is used.  

A useful website that frequently appears in our search results for debugging problems is Stack Overflow from the StackExchange network.  It contains very relevant coding and debugging questions, answers and discussions.  If possible, we should read several of the highly voted posts in a discussion thread, including the comments, not just the post with the highest number of votes.  

When we can't find answers in this way, we can also post our problems on Stack Overflow.  When we post our problems, we should follow the problem description requirements as stated at https://maxloo-coding-debugging.blogspot.com/2022/11/use-issue-tracking-system-for-all.html, namely: "Ideally, a short, complete, and compilable or runnable example should be provided in the issue description."  More information about this can be found at: http://sscce.org/, http://www.catb.org/%7Eesr/faqs/smart-questions.html.  For some languages, our problems can also be presented live at: https://jsfiddle.net/, https://www.sourcelair.com/.  

If we find that our problem may have arisen from an open source library or program, and we have good reasons to believe there's a bug in that code, we can search that code's related issues for solutions, and open an issue on that code's bug tracking system if no similar issue can be found.  It may even be possible to send a polite, careful and considerate email to the developer of that code if no bug tracking system is available, but we must remember that they aren't paid to support us.  


Reference:

Diomidis Spinellis, "Effective Debugging: 66 Specific Ways to Debug Software and Systems"

Sunday, November 13, 2022

Use an Issue Tracking System for All Problems

Summary

All problems should be handled by an issue tracking system.  Each issue should have a detailed description about how to reproduce the problem, and a short, complete, and compilable or runnable example.  Our debugging work should be scheduled based on the priority and severity of each issue.  Our progress in handling the issues should be documented by the issue tracking system.  

Based on ISO/IEC/IEEE 24765:2010, a fault, also known as a defect or bug, is "an incorrect step, process, or data definition in a computer program."  A failure, which results when a fault is encountered, is defined as "an event in which a system or system component does not perform a required function within specified limits."  A problem can either be a fault, which is a problem in the code, or a failure, which is a reproducible problem.  Routines are callable units of code, which include member functions, methods, functions, procedures, and subroutines.

When setting out to fix a problem, choose the most appropriate strategy to succeed with less effort.  If it doesn't work, choose the next most appropriate.  

GitHub and GitLab provide basic issue tracking functionalities in their systems.  Other open source options include: Bugzilla, Launchpad, OTRS, Redmine, or Trac.  Such systems must be used to file all problems because:
* they provide visibility to debugging efforts
* they enable the tracking and planning of releases
* they facilitate prioritisation of work items
* they help document common problems and solutions
* they ensure that no problems will be missed
* they allow automatic generation of release notes
* they serve as repositories for measuring, reflecting on, and learning from these defects

Problems not recorded in these systems as issues should not be handled.  Some organisations do not permit changes to source code, unless the change is associated with an issue.  Each issue should have a detailed description of how to reproduce it.  Ideally, a short, complete, and compilable or runnable example should be provided in the issue description.  Each issue or bug report should also have a precise title, details about the bug's priority and severity, a list of the affected stakeholders, and essentials about its environment configuration or situation.  We should also document our progress in handling the issues by appending comments to each issue entry in the issue tracking system we use.  To ensure transparency, all steps taken to investigate and fix the bug should be recorded, including dead ends and the precise commands to log or trace the program's behaviour.  


Reference:

Diomidis Spinellis, "Effective Debugging: 66 Specific Ways to Debug Software and Systems"

Mirdin Coding Tips

Tradeoffs for Pre and Post Conditions for Code Blocks A weaker precondition or less preconditions for a code block will be more general and ...